APOPTOSIS MODULATORS THAT INTERACT WITH THE
HUNTINGTON'S DISEASE GENE 

BACKGROUND OF THE INVENTION 

This application relates to a family of apoptosis modulators that interact with the 
Huntington's Disease gene product, and to methods and compositions relating thereto.

"Interacting proteins" are proteins which associate in vivo to form specific complexes. 
Non-covalent bonds, including hydrogen bonds, hydrophobic interactions and other
molecular associations, form between the proteins when two protein surfaces are matched or
have affinity for each other. This affinity or match is required for the recognition of the two
proteins, and the formation of an interaction. Protein-protein interactions are involved in the
assembly of enzyme subunits; in antigen-antibody reactions; in forming the supramolecular
structures of ribosomes, filaments, and viruses; in transport; and in the interaction of
receptors on a cell with growth factors and hormones.

Huntington's disease is an adult-onset disorder characterized by selective neuronal loss
in discrete regions of the brain and spinal cord that leads to progressive movement disorder,
personality change and intellectual decline. From onset, which generally occurs around age
40, the disease progresses with worsening symptoms, ending in death approximately 18 years
after onset.

The biochemical cause of Huntington's disease remains unclear. However, a mutation in a
gene within the chromosome 4p16.3 subband has been identified and linked to the disease.
This gene, referred to as the Huntington's Disease or HD gene, contains two repeat regions,
a CAG repeat region and a CCG repeat region. Testing of Huntington's disease patients has
shown that the CAG region is highly polymorphic, and that the number of CAG repeat units in
the CAG repeat region is a very reliable indicator of having inherited the gene for
Huntington's disease. Thus, in control individuals and in most individuals suffering from
neuropsychiatric disorders other than Huntington's disease, the number of CAG repeats is
between 9 and 35, while in individuals suffering from Huntington's disease the number of CAG
repeats is expanded and is 36 or greater.
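The repeat-count thresholds described above can be illustrated with a short Python sketch. It is
only an illustration under stated assumptions: the function names are hypothetical, the input is
assumed to be a plain DNA string, and counting the longest uninterrupted run of CAG units is a
simplification that is not the sizing method used in the testing referred to in the text.

```python
import re

# Thresholds taken from the text: 9-35 repeats in control individuals,
# 36 or more in individuals suffering from Huntington's disease.
NORMAL_MIN = 9
NORMAL_MAX = 35
EXPANDED_MIN = 36


def longest_cag_run(sequence: str) -> int:
    """Return the number of units in the longest uninterrupted CAG run (hypothetical helper)."""
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_repeat_count(n_repeats: int) -> str:
    """Classify a CAG repeat count against the ranges quoted in the text."""
    if n_repeats >= EXPANDED_MIN:
        return "expanded"
    if NORMAL_MIN <= n_repeats <= NORMAL_MAX:
        return "normal range"
    return "below reported range"


if __name__ == "__main__":
    toy_allele = "ATG" + "CAG" * 16 + "CCG"  # toy sequence, not the HD gene
    count = longest_cag_run(toy_allele)
    print(count, classify_repeat_count(count))
```

A count below 9 is reported separately because the text gives no interpretation for it.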
To date, no differences have been observed at the total RNA, mRNA or protein
levels between normal and HD-affected individuals. Thus, the function of the HD protein
and its role in the pathogenesis of Huntington's Disease remain to be elucidated.

SUMMARY OF THE INVENTION

We have now identified a protein, designated HIP1, that interacts differently with
the gene product of a normal (16 CAG repeat) and an expanded (>44 CAG repeat) HD gene.
The HIP1 protein originally isolated from a yeast two-hybrid screen is encoded by a 1.2 kb
cDNA (Seq. ID. No. 1), devoid of stop codons, that is expressed as a 400 amino acid
polypeptide (Seq. ID. No. 2). Subsequent study has elucidated additional sequence for HIP1
such that a 1090 amino acid protein is now known (Seq. ID No. 5). Expression of the HIP1
protein was found to be enriched in the brain.

Analysis of the sequence of the HIP1 protein indicated that it includes a death effector
domain (DED), suggesting an apoptotic function. Thus, it appears that a normal function of
huntingtin may be to bind HIP1 and related apoptosis modulators, reducing their effectiveness
in stimulating cell death. Since expanded huntingtin performs this function less well, there is
an increase in HIP1-modulated cell death in individuals with an expanded repeat in the HD
gene. Furthermore, additional members of the same family of proteins have been identified
which also contain a DED. Thus, the present invention provides a new class of apoptotic
modulators which are referred to as HIP-apoptosis modulating proteins.

This understanding of the likely role of huntingtin and HIP1 or related proteins in the
pathology of Huntington's Disease offers several possibilities for therapy. First, because the
function of huntingtin apparently depends at least in part on the ability to interact with
HIP-apoptosis modulating proteins, added expression (e.g., via gene therapy) of normal
(non-expanded) huntingtin or of the HIP-binding region of huntingtin should provide a
therapeutic benefit. Other DED-interacting peptides could also be used to mask and reduce the
interaction of HIP-apoptosis modulating proteins with the death signaling complex.
Alternatively, a mutant form of HIP-protein from which the DED has been deleted might be
introduced, for example using gene therapy techniques. Because HIP-apoptosis modulating
proteins have been shown to self-associate, a protein with a deleted DED may compete with
endogenous HIP-protein in the formation of these associations, thereby reducing the amount
of apoptotically-active HIP-protein.

BRIEF DESCRIPTION OF THE DRAWING 
Fig. 1 graphically depicts the amount of interaction between HIP1 and huntingtin
proteins with varying lengths of polyglutamine repeat;

Fig. 2 compares the nucleic acid sequences of human and murine HIP1 and HIP1a;

Fig. 3 compares the amino acid sequences of human and murine HIP1 and HIP1a;

Fig. 4 shows the sequences of various death effector domains in comparison to the
DED of human and murine HIP1 and HIP1a;

Fig. 5 shows the genomic organization of human HIP1;

Fig. 6 compares the sequences of human HIP1 with ZK370.3 protein of C. elegans;

Fig. 7 shows mouse ESTs with homology to human HIP1 cDNA used to screen a
mouse brain library;

Fig. 8 shows the effect of HIP1 on susceptibility of cells to stress; and

Figs. 9A - 9C show the toxicity of HIP1 in the presence of huntingtin with different
lengths of polyglutamine repeats.

DETAILED DESCRIPTION OF THE INVENTION 

This application relates to a new family of proteins that function as modulators of
apoptosis. At least some of these proteins, notably the human protein designated HIP1, interact
with the gene product of the Huntington's disease gene. Other proteins within the family
possess at least 40% and preferably more than 50% nucleotide identity with HIP1 and include
a death effector domain (DED). Such proteins are referred to in the specification and claims
hereof as "HIP-apoptosis modulating proteins."

The first HIP-apoptosis modulating protein identified was designated as HIP1. HIP1
was identified using the yeast two-hybrid system described in US Patent No. 5,283,173, which
is incorporated herein by reference. Briefly, this system utilizes two chimeric genes or
plasmids expressible in a yeast host. The yeast host is selected to contain a detectable marker
gene having a binding site for the DNA-binding domain of a transcriptional activator. The
first chimeric gene or plasmid encodes a DNA-binding domain which recognizes the binding
site of the detectable marker gene and a test protein or protein fragment. The second chimeric
gene or plasmid encodes a second test protein and a transcriptional activation domain.
The two chimeric genes or plasmids are introduced into the host cell and expressed, and the
cells are cultivated. Expression of the detectable marker gene only occurs when the gene
product of the first chimeric gene or plasmid binds to the binding site of the
detectable marker gene, and a transcriptional activation domain is brought into sufficient
proximity to the DNA-binding domain, an occurrence which is facilitated by protein-protein
interactions between the first and second test proteins. By selecting for cells expressing the
detectable marker gene, those cells which contain chimeric genes or plasmids for interacting
proteins can be identified, and the gene can be recovered and identified.
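The selection logic of the two-hybrid system can be sketched as a toy Python model. This is a
deliberately simplified illustration, not a description of the patented system: the function names,
the bait label "HD(1-540)" and the table of interacting pairs are hypothetical inputs chosen for
the example, and real reporter activation depends on many factors the model ignores.

```python
from typing import FrozenSet, Iterable, List, Set


def reporter_expressed(bait: str, prey: str,
                       interactions: Set[FrozenSet[str]]) -> bool:
    """Toy rule: the marker gene is expressed only if bait and prey interact.

    The bait is assumed to be fused to the DNA-binding domain and the prey to
    the activation domain; an interaction brings the two domains together.
    """
    return frozenset((bait, prey)) in interactions


def select_positive_clones(bait: str, library: Iterable[str],
                           interactions: Set[FrozenSet[str]]) -> List[str]:
    """Return the library clones whose fusion protein activates the reporter."""
    return [prey for prey in library if reporter_expressed(bait, prey, interactions)]


if __name__ == "__main__":
    # Hypothetical interaction table used only to exercise the model.
    known = {frozenset(("HD(1-540)", name)) for name in ("HIP1", "HIP2", "HIP3")}
    library = ["HIP1", "unrelated_A", "HIP2", "unrelated_B", "HIP3"]
    print(select_positive_clones("HD(1-540)", library, known))
```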

In testing for Huntington Interacting Proteins, several different plasmids were
prepared containing portions of the human HD gene. The first four, identified as 16pGBT9,
44pGBT9, 80pGBT9 and 128pGBT9, were GAL4 DNA binding domain-HD in-frame
fusions containing nucleotides 314 to 1955 (amino acids 1-540) of the published HD cDNA
sequences cloned into the vector pGBT9 (Clontech). These plasmids contain a CAG repeat
region of 16, 44, 80 and 128 glutamine-encoding repeats, respectively. A clone (DMK
BamHIpGBT9) was made by fusing a cDNA encoding the first 544 amino acids of the
myotonic dystrophy gene (a gift from R. Korneluk) in-frame with the GAL4-DNA BD of
pGBT9 and was used as a negative control.

These plasmids have been used to identify and characterize HIP1, as well as two
additional HD-interacting proteins, HIP2 and HIP3, which have not yet been tested for
function as apoptosis modulators. These plasmids can be further used for the identification of
additional interacting proteins which do act as apoptosis modulators, and for tests to refine
the region on the protein in which the interaction occurs. Thus, one aspect of the invention is
these four plasmids, and the use of these plasmids in identifying HD-interacting proteins.
Furthermore, it will be appreciated that the GAL4 DNA-binding and activating domains are
not the only domains which can be used in the yeast two-hybrid assay. Thus, in a broader
sense, the invention encompasses any chimeric genes or plasmids containing nucleotides 314
to 1955 of the HD gene together with an activating or DNA-binding domain suitable for use
in the yeast one-, two- or three-hybrid assay for proteins critical in either binding to the HD
protein or responsible for regulated expression of the HD gene.

After introducing the plasmids into Y190 yeast host cells, transforming the host cells
with an adult human brain Matchmaker™ (Clontech) cDNA library coupled with a GAL4
activating domain, and selecting for the expression of two detectable marker genes to identify
clones containing genes for interacting proteins, the activating domain plasmids were
recovered and analyzed. As a result of this analysis, three different cDNA fragments were
identified as encoding HD-interacting proteins and designated as HIP1, HIP2 and HIP3.
The nucleic acid sequence of HIP1, as originally recovered in the yeast two-hybrid assay, is
given in Seq. ID. No. 1. The polypeptide which it encodes is given by Seq. ID No. 2. Further
investigation of the HIP1 cDNA resulted in the characterization of a longer region of cDNA
totaling 4795 bases and a corresponding protein, the sequences of which are given by Seq ID
Nos. 3 and 4, respectively. A further portion of the HIP1 protein was characterized,
extending the length to the complete protein sequence of 1090 amino acids (Seq. ID No. 5).

The cDNA molecules encoding HIP-apoptosis modulating proteins, particularly those
encoding portions of HIP1, can be explored using oligonucleotide probes, for example for
amplification and sequencing. In addition, oligonucleotide probes complementary to the
cDNA can be used as diagnostic probes to localize and quantify the presence of HIP1 DNA.
Probes of this type with a one or two base mismatch can also be used in site-directed
mutagenesis to introduce variations into the HIP1 sequence which may increase or decrease
the apoptotic activity. Preferred targets for such mutations would be the death effector
domains. Thus, a further aspect of the present invention is an oligonucleotide probe,
preferably having a length of from 15-40 bases, which specifically and selectively hybridizes
with the cDNA given by Seq. ID No. 1 or 3 or a sequence complementary thereto. As used
herein, the phrase "specifically and selectively hybridizes with" the cDNA refers to primers
which will hybridize with the cDNA under stringent hybridization conditions.
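As a small illustration of the probe properties described above, the following Python sketch
computes the sequence complementary to an oligonucleotide and checks it against the preferred
15-40 base length. The function names are hypothetical, and the example sequence is the GAL4-AD
sequencing oligonucleotide listed in Example 3 (Seq. ID No. 12), used here only as a convenient
test string. The sketch does not model hybridization stringency.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(oligo: str) -> str:
    """Return the reverse complement of a DNA oligonucleotide (spaces ignored)."""
    cleaned = oligo.replace(" ", "").upper()
    return cleaned.translate(_COMPLEMENT)[::-1]


def has_preferred_length(oligo: str, min_len: int = 15, max_len: int = 40) -> bool:
    """Check the preferred probe length of 15-40 bases quoted in the text."""
    n_bases = len(oligo.replace(" ", ""))
    return min_len <= n_bases <= max_len


if __name__ == "__main__":
    ad_oligo = "GAA GAT ACC CCA CCA AAC"  # Seq. ID No. 12, written 5' to 3'
    print(reverse_complement(ad_oligo), has_preferred_length(ad_oligo))
```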

Probes of this type can also be used for diagnostic purposes to characterize risk of
Huntington's Disease-like symptoms arising in individuals where the symptoms are present in
the family history but are not associated with an expansion of the CAG repeat. Such
symptoms may arise from a mutation in HIP1 or another HIP-apoptosis modulating protein
which alters the interaction of this protein with huntingtin, thereby increasing the apoptotic
activity of the protein even in the presence of a normal (non-expanded) huntingtin molecule.
An appropriate probe for this purpose would be one which hybridizes with or adjacent to the
huntingtin binding region of the HIP-apoptosis modulating protein. In HIP1, this lies within
amino acids 129-514.



DNA sequencing of the HIP1 cDNA initially isolated from the yeast two-hybrid
screen (Seq. ID No. 1) revealed a 1.2 kb cDNA that shows no significant degree of nucleic
acid identity with any stretch of DNA using the blastn program at NCBI
(blast@ncbi.nlm.nih.gov). When the larger HIP1 cDNA sequence (SEQ ID NO. 3) was
translated into a polypeptide, the HIP1 cDNA coding region (nucleotides 328-3069) was
observed to be devoid of stop codons, and to produce a 914 amino acid polypeptide. A
polypeptide identity search revealed an identity match over the entire length of the protein
(46% conservation) with that of a hypothetical protein from C. elegans (ZK370.3 protein; C.
elegans cosmid ZK370). This C. elegans protein shares identity with the mouse talin gene,
which encodes a 217 kDa protein implicated in maintaining integrity of the cytoskeleton.
It also shares identity with the SLA2/MOP2/END4 gene from Saccharomyces cerevisiae,
which is known to code for an essential cytoskeleton-associated protein required for the
accumulation and/or maintenance of plasma membrane H+-ATPase on the cell surface.
When pairwise comparisons are performed between HIP1 and the C. elegans ZK370.3 protein
(Genpept accession number celzk370.3), they show 26% complete identity and an overall 46%
level of conservation. Comparative analysis between HIP1 and SLA2/MOP2/END4 (EMBL
accession number Z22811) demonstrates similar conservation (20% identity, 40%
conservation).
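The relationship between the coding region coordinates and the polypeptide length stated above
can be checked with simple arithmetic: nucleotides 328 through 3069 inclusive span 2742 bases,
which is 914 codons. The Python sketch below performs that check and also shows one way an
in-frame stop codon scan might be written. The function names are hypothetical, and the scan is
demonstrated only on toy sequences, not on SEQ ID NO. 3.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_count(first_nt: int, last_nt: int) -> int:
    """Number of codons in a coding region given 1-based inclusive coordinates."""
    length = last_nt - first_nt + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("coordinates do not span a whole number of codons")
    return length // 3


def has_in_frame_stop(coding_seq: str) -> bool:
    """Return True if any codon read in frame from position 0 is a stop codon."""
    seq = coding_seq.upper()
    return any(seq[i:i + 3] in STOP_CODONS for i in range(0, len(seq) - 2, 3))


if __name__ == "__main__":
    # Coordinates quoted in the text for the larger HIP1 cDNA.
    print(codon_count(328, 3069))
    # Toy sequences only; they are not taken from the sequence listing.
    print(has_in_frame_stop("ATGGCCAAA"), has_in_frame_stop("ATGGCCTAA"))
```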



Further exploration revealed several important facts about HIP1 that implicate it
significantly in the pathogenesis of Huntington's Disease. First, as shown in Fig. 1, it was
found that the native interaction between HD protein and HIP1 is influenced by the number
of CAG repeats. Second, it was found that expression of the HIP1 protein is enriched in the
brain. The highest amounts of expression are in the cortex, with lower levels being seen in
the cerebellum, caudate and putamen.

It has also been observed that huntingtin proteins with expanded polyglutamine tracts
can aggregate into large, irregularly shaped deposits in HD brains, transgenic mice and in
vitro cell culture. We have shown that in HEK (human embryonic kidney) 293T cells, the
aggregation of full-length and smaller huntingtin fragments occurs after the cells have been
exposed to a period of apoptotic stress. Martindale, et al., Nature Genetics 18: 150-154
(1998). In order to assess the consequence of HIP1 expression in cultured cells, we used
huntingtin aggregation as one marker of viability. We found that cells
cotransfected with huntingtin (128 CAG repeats) and HIP1 contained aggregates comparable
to those observed following application of apoptotic stress with sub-lethal doses of tamoxifen
in 14% of the cells, and that these cells were the ones in which both genes had been
introduced, as reflected by a double marker experiment. Transfection of a gene encoding a
fusion protein of 128 repeat huntingtin and the DED domain from HIP1 ligated in the sense
orientation resulted in aggregate formation in 30 to 50% of the cells.

The implications of the apoptotic activity of HIP1 are two-fold. First, the fact that
this activity is apparently differentially modulated by interaction with huntingtin having
normal and expanded repeats implicates HIP1 in the apoptotic neuronal death which is
observed in Huntington's disease and makes HIP1 a logical target for therapy. A second
implication of the apoptotic activity of HIP1 is the potential for use of HIP1 as a therapeutic
agent to induce apoptosis in cancer cells.

Therapeutic targeting of HIP1 or other HIP-apoptosis modulating proteins might take
any of several forms, but will in general be a treatment involving administration of a
composition that reduces the apoptotic activity of the HIP-apoptosis modulating protein. As
used in the specification and claims hereof, the term "administration" includes direct
administration of a composition active to reduce apoptotic activity as well as indirect
administration which might include administration of pro-drugs or nucleic acids that encode
the desired therapeutic composition.

One class of compositions which can be used in the therapeutic methods of the
invention comprises those compositions which interfere with the activity of HIP-apoptosis
modulating proteins by binding to the proteins, thereby masking and reducing the interaction
of HIP-apoptosis modulating proteins with the death signaling complex. Within this class of
compositions are normal (non-expanded) huntingtin, administered, for example, via increased
expression of exogenous HD genes; the HIP-binding region of huntingtin, administered via
gene therapy techniques; and other DED-interacting peptides. Other DED-interacting
peptides which might be used in a therapeutic method of this type include FADD (Boldin et
al., Cell 85: 803-815 (1996)) and caspase 8 (Muzio et al., Cell 85: 817-827 (1996)).

An alternative form of therapy involves the use of a mutant form of HIP1 or other
HIP-apoptosis modulating protein from which the DED has been deleted. DED-containing
proteins, including HIP1, are self-associating, and this self-association has been shown to be
important for activity (Muzio et al., Cell 85: 817-827 (1996)). Thus, a protein with a deleted
DED may compete with endogenous HIP-protein in the formation of these associations,
thereby reducing the amount of apoptotically-active HIP-protein.

In addition to HIP1, we have identified a further human protein, HIP1a, from a
human frontal cortex cDNA library. HIP1a is a family member of HIP1, and thus a
HIP-apoptosis modulator in accordance with the invention. A partial sequence of HIP1a (the 5'
portion of HIP1a remains to be characterized) is given by SEQ ID Nos. 6 and 7. The isolated
and characterized portion of HIP1a shows 53% nucleotide identity and 58% amino acid
conservation with HIP1 (Table 1, Figs. 2 and 3).

We have also isolated two mouse proteins, mHIP1 and mHIP1a (SEQ. ID Nos. 8-11),
which appear to be the murine homologues of human HIP1 and HIP1a. As in the case of
human HIP1a, the 5' portion of mHIP1 remains to be isolated. At present, mHIP1 shows 85%
nucleotide identity and 90% amino acid conservation with huHIP1 (Table 1, Figs. 2 and 3).
mHIP1a shows 60% nucleotide identity and 61% amino acid conservation with huHIP1
(Table 1, Figs. 2 and 3). mHIP1a shows stronger homology to huHIP1a; it shows 87%
nucleotide identity and 91% amino acid conservation with huHIP1a (Table 1, Figs. 2 and 3).
Taken together, these findings indicate that mHIP1 is the murine homologue of huHIP1,
whereas mHIP1a is most likely the murine homologue of huHIP1a. As mentioned previously,
HIP1 shows sequence similarity to Sla2p in S. cerevisiae and the hypothetical protein
ZK370.3 in C. elegans. Similarly, huHIP1a, mHIP1, and mHIP1a show sequence similarity to
Sla2p and ZK370.3 (Table 2). The carboxy-terminal regions of huHIP1a, mHIP1, and
mHIP1a all show considerable homology to the mammalian membrane
cytoskeletal-associated protein, talin. This suggests that these 3 proteins may also play a role
in the regulation of membrane events through interactions with the underlying cytoskeleton.

HIP1 contains a death effector domain (DED), a domain which is also present in a
number of proteins involved in the apoptotic pathway (Fig. 4). This suggests that HIP1 may
act as a modulator of the apoptosis pathway. The DED in huHIP1 is present between amino
acid positions 287 and 368. Similarly, HIP1a, mHIP1, and mHIP1a also contain a DED. In
huHIP1a the DED is present at amino acids 1-78 of the recovered fragment. In mHIP1 and
mHIP1a, the DEDs are present at amino acids 128-210 and 388-470, respectively. The DEDs
present in huHIP1a, mHIP1 and mHIP1a all show significant percentage amino acid
conservation to the DED present in huHIP1 (Table 3).

Increasing expression of normal (non-expanded) huntingtin or the HIP-apoptotic
modulator-binding portion thereof, of a modified HIP-apoptotic modulator in which the DED
has been deleted, or of a DED-interacting protein or peptide can be accomplished using gene
therapy approaches. In general, this will involve introduction of DNA encoding the
appropriate protein or peptide in an expressible vector into the brain cells. Expression of
HIP-apoptosis modulating proteins may also be useful in treatment of cancer, in which case
application to other cell types would be desired, and cells expressing HIP-apoptosis
modulating proteins may be used for screening of therapeutic compounds. Thus, in a more
general sense, expression vectors are defined herein as DNA sequences that are required for
the transcription of cloned copies of genes and the translation of their mRNAs in an
appropriate cell type. Specifically designed vectors allow the shuttling of DNA between
hosts such as bacteria-yeast or bacteria-animal cells. An appropriately constructed expression
vector may contain: an origin of replication for autonomous replication in host cells,
selectable markers, a limited number of useful restriction enzyme sites, a potential for high
copy number, and active promoters. A promoter is defined as a DNA sequence that directs
RNA polymerase to bind to DNA and initiate RNA synthesis. A strong promoter is one
which causes mRNAs to be initiated at high frequency. Expression vectors may include, but
are not limited to, cloning vectors, modified cloning vectors, specifically designed plasmids
or viruses.

A variety of mammalian expression vectors may be used to express recombinant
HIP-apoptosis modulating proteins or fragments thereof in mammalian cells. Commercially
available mammalian expression vectors which may be suitable for recombinant
HIP-apoptosis modulating protein expression include, but are not limited to, pMC1neo
(Stratagene), pXT1 (Stratagene), pSG5 (Stratagene), EBO-pSV2-neo (ATCC 37593),
pBPV-1(8-2) (ATCC 37110), pdBPV-MMTneo(342-12) (ATCC 37224), pRSVgpt (ATCC
37199), pRSVneo (ATCC 37198), pSV2-dhfr (ATCC 37146), pUCTag (ATCC 37460), and
λZD35 (ATCC 37565). Other vectors which have been shown to be suitable expression
systems in mammalian cells include the herpes simplex viral based vectors: pHSV1 (Geller
et al. Proc. Natl. Acad. Sci. 87:8950-8954 (1990)); recombinant retroviral vectors: MFG
(Jaffee et al. Cancer Res. 53:2221-2226 (1993)); Moloney-based retroviral vectors: LN,
LNSX, LNCX, LXSN (Miller and Rosman Biotechniques 7:980-989 (1989)); vaccinia viral
vector: MVA (Sutter and Moss Proc. Natl. Acad. Sci. 89:10847-10851 (1992)); recombinant
adenovirus vectors: pJM17 (Ali et al. Gene Therapy 1:367-384 (1994)), (Berkner K. L.
Biotechniques 6:616-624 (1988)); second generation adenovirus vector: DE1/DE4 adenoviral
vectors (Wang and Finer Nature Medicine 2:714-716 (1996)); and Adeno-associated viral
vectors: AAV/Neo (Muro-Cacho et al. J. Immunotherapy 11:231-237 (1992)).

The expression vector may be introduced into host cells via any one of a number of
techniques including but not limited to transformation, transfection, infection, protoplast
fusion, and electroporation. The expression vector-containing cells are clonally propagated
and individually analyzed to determine whether they produce the desired protein. Delivery of
retroviral vectors to brain and nervous system tissue has been described in US Patents Nos.
4,866,042, 5,082,670 and 5,529,774, which are incorporated herein by reference. These
patents disclose the use of cerebral grafts or implants as one mechanism for introducing
vectors bearing therapeutic gene sequences into the brain, as well as an approach in which the
vectors are transmitted across the blood brain barrier.

To further illustrate the methods of making the materials which are the subject of this 
invention, and the testing which has established their utility, the following non-limiting 
experimental procedures are provided. 
EXAMPLE 1

IDENTIFICATION OF INTERACTING PROTEINS 
GAL4-HD cDNA constructs 

An HD cDNA construct (44pGBT9) with 44 CAG repeats was generated,
encompassing amino acids 1-540 of the published HD cDNA. This cDNA fragment was
fused in frame to the GAL4 DNA-binding domain (BD) of the yeast two-hybrid vector
pGBT9 (Clontech). Other HD cDNA constructs, 16pGBT9, 80pGBT9 and 128pGBT9, were
constructed identically to 44pGBT9 but included 16, 80 or 128 CAG repeats,
respectively.

Another clone (DMKDBamHIpGBT9), containing the first 544 amino acids of the
myotonic dystrophy gene (a gift from R. Korneluk) fused in-frame with the GAL4-DNA
BD of pGBT9, was used as a negative control. Plasmids expressing the GAL4-BD-RAD7
(D. Gietz, unpublished) and SIR3 were used as a positive control for the β-galactosidase filter
assay.

The clones IT15-23Q, IT15-44Q and HAP1 were generous gifts from Dr. C. Ross.
These clones represent a previously isolated huntingtin interacting protein that has a higher
affinity for the expanded form of the HD protein.

Yeast strains, transformations and β-galactosidase assays
The yeast strain Y190 (MATa leu2-3,112, ura3-52, trp1-901, his3-Δ200, ade2-101,
gal4Δ gal80Δ, URA3::GAL-lacZ, LYS2::GAL-HIS3, cycO) was used for all transformations
and assays. Yeast transformations were performed using a modified lithium acetate
transformation protocol and grown at 30°C using appropriate synthetic complete (SC) dropout
media.

The β-galactosidase chromogenic filter assays were performed by transferring the
yeast colonies onto Whatman filters. The yeast cells were lysed by submerging the filters in
liquid nitrogen for 15-20 seconds. Filters were allowed to dry at room temperature for at
least five minutes and placed onto filter paper presoaked in Z-buffer (100 mM sodium
phosphate (pH 7.0), 10 mM KCl, 1 mM MgSO4) supplemented with 50 mM
2-mercaptoethanol and 0.07 mg/ml 5-bromo-4-chloro-3-indolyl β-D-galactoside (X-gal).
Filters were placed at 37°C for up to 8 hours.

Yeast two-hybrid screening for huntingtin interacting protein (HIP)

cDNAs from a human adult brain Matchmaker™ cDNA library (Clontech) were
transformed into the yeast strain Y190 already harboring the 44pGBT9 construct. The
transformants were plated onto one hundred 150 mm x 15 mm circular culture dishes
containing SC media deficient in Trp, Leu and His. The herbicide 3-amino-triazole (3-AT)
(25 mM) was utilized to limit the number of false His+ positives (31). The yeast
transformants were placed at 30°C for 5 days and β-galactosidase filter assays were performed
on all colonies found after this time, as described above, to identify β-galactosidase+ clones.
Primary His+/β-galactosidase+ clones were then patched in order onto a grid on SC
-Trp/-Leu/-His (25 mM 3-AT) plates and assayed again for His+ growth and the ability to turn
blue with a filter assay. Secondary positives were identified for further analysis. Proteins
encoded by positive cDNAs were designated as HIPs (Huntingtin Interactive Proteins).
Approximately 4.0 x 10^ Trp/Leu auxotrophic transformants were screened, and of 14 clones
isolated, 12 represented the same cDNA (HIP1), and the other 2 cDNAs, HIP2 and HIP3, were
each represented only once.

The HIP cDNA plasmids were isolated by growing the His+/β-galactosidase+ colony
in SC -Leu media overnight, lysing the cells with acid-washed glass beads and
electroporating the bacterial strain KC8 (leuB auxotrophic) with the yeast lysate. The KC8
ampicillin-resistant colonies were replica plated onto M9 (-Leu) plates. The plasmid DNA
from M9+ colonies was transformed into DH5-α for further manipulation.

EXAMPLE 2

CONFIRMATION OF INTERACTIONS
The HIP1-GAL4-AD cDNA activated both the lacZ and His reporter genes in the
yeast strain Y190 only when co-transformed with the GAL4-BD-HD construct, and not with
the negative controls (Fig. 1) of the vector alone or a random fusion protein of the myotonin
kinase gene. In order to assess the influence of the polyglutamine tract on the interaction
between HIP1 and HD, semi-quantitative β-galactosidase assays were performed.
GAL4-BD-HD fusion proteins with 16, 44, 80 and 128 glutamine repeats were assayed for
their strength of interaction with the GAL4-AD-HIP1 fusion protein.

Liquid β-galactosidase assays were performed by inoculating a single yeast colony
into appropriate synthetic complete (SC) dropout media and growing it to OD600 0.6-1.5. Five
millilitres of overnight culture were pelleted and washed once with 1 ml of Z-Buffer, then
resuspended in 100 µl Z-Buffer supplemented with 38 mM 2-mercaptoethanol and 0.05%
SDS. Acid-washed glass beads (~100 µl) were added to each sample, and the samples were
vortexed for four minutes by repeatedly alternating 30 seconds of vortexing with 30 seconds
on ice. Each sample was pelleted and 10 µl of lysate was added to 500 µl of lysis buffer. The
samples were incubated in a 30°C waterbath for 30 seconds and then 100 µl of a 4 mg/ml
o-nitrophenyl β-D-galactopyranoside (ONPG) solution was added to each tube. The reaction
was allowed to continue for 20 minutes at 30°C and stopped by the addition of 500 µl of 1 M
Na2CO3 and placing the samples on ice. Subsequently, OD420 was taken in order to calculate
the β-galactosidase activity with the equation 1000 x OD420/(t x V x OD600), where t is the
elapsed time (minutes) and V is the amount of lysate used.
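The activity equation above can be written as a short Python function. This is a minimal sketch:
the function and parameter names are hypothetical, the text does not state the units of V (the
example assumes the same volume unit is used for every sample being compared), and the numbers in
the usage example are made up to exercise the formula rather than taken from Fig. 1.

```python
def beta_gal_activity(od420: float, od600: float,
                      minutes: float, lysate_volume: float) -> float:
    """Compute 1000 x OD420 / (t x V x OD600) as given in the text.

    minutes is the elapsed reaction time t; lysate_volume is V, in whatever
    volume unit is used consistently across samples (an assumption here).
    """
    if minutes <= 0 or lysate_volume <= 0 or od600 <= 0:
        raise ValueError("time, volume and OD600 must all be positive")
    return 1000.0 * od420 / (minutes * lysate_volume * od600)


if __name__ == "__main__":
    # Hypothetical readings: OD420 = 0.5, OD600 = 1.0, 20 minute reaction, V = 0.01.
    print(round(beta_gal_activity(0.5, 1.0, 20.0, 0.01), 1))
```

Because V and OD600 appear in the denominator, the result normalizes the colour signal to the
amount of lysate and to the culture density, which is what allows constructs with different repeat
lengths to be compared.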

The specificity of the HIP1-HD interaction can be observed using the chromogenic
filter assay. Only yeast cells harboring HIP1 and HD activate both the HIS and lacZ reporter
genes in the Y190 yeast host. The cells that contain HIP1 with HD constructs with 80 or
128 CAG repeats turn blue approximately 45 minutes after the cells with the smaller sized
repeats (16 or 44).

No difference in the β-galactosidase activity was observed between the 16 and 44
repeats or between the 80 and 128 repeats. However, a significant difference (p<0.05) in
activity is seen between the smaller repeats (16 and 44) and the larger repeats (80 and 128)
(Figure 1).

EXAMPLE 3 

DNA SEQUENCING, cDNA ISOLATION AND 5' RACE
Oligonucleotide primers were synthesized on an ABI PCR-mate oligo-synthesizer.
DNA sequencing was performed using an ABI 373 fluorescent automated DNA sequencer.
The HIP cDNAs were confirmed to be in-frame with the GAL4-AD by sequencing across the
AD-HIP1 cloning junction using an AD oligonucleotide (5' GAA GAT ACC CCA CCA
AAC 3') (Seq. ID No. 12).

Subsequently, primer walking was used to determine the remaining sequences. A 
human frontal cortex >4.0 kb cDNA library (a gift from S. Montal) was screened to isolate 
the full length HIP1 gene. Fifty nanograms of a 558 base pair EcoRI fragment from the 
original HIP1 cDNA was radioactively labeled with [α-32P]-dCTP using nick-translation and 
the probe was allowed to hybridize to filters containing >10^5 pfu/ml of the cDNA library 
overnight at 65°C in Church buffer (see Northern blot protocol). The filters were washed at 
65°C for 10 minutes with 1X SSPE, 15 minutes at 65°C with 1X SSPE and 0.1% SDS, then 
for thirty minutes and fifteen minutes with 1X SSPE and 0.1% SDS. The filters were 
exposed to X-ray film (Kodak, XAR5) overnight at -70°C. Primary positives were isolated 
and replated, and subsequent secondary positives were hybridized and washed as for the 
primary screen. The resulting positive phage were converted into plasmid DNA by 
conventional methods (Stratagene) and the cDNA was isolated and sequenced. 

In order to obtain the most 5' sequence of the HIP1 gene, a Rapid Amplification of 
cDNA Ends (RACE) protocol was performed according to the manufacturer's 
recommendations (BRL). First strand cDNA was synthesized using the oligo HIP1-242R (5' 
GCT TGA CAG TGT AGT CAT AAA GGT GGC TGC AGT CC 3') (Seq. ID No. 13). 
After dCTP tailing of the cDNA with terminal deoxynucleotidyl transferase, two rounds of 
35 cycles (94°C 1 minute; 53°C 1 minute; 72°C 2 minutes) of PCR using HIP1-R2 (5' GGA CAT 
GTC CAG GGA GTT GAA TAC 3') (Seq. ID No. 14) and an anchor primer (5' (CUA)4 
GGC CAC GCG TCG ACT AGT ACG GGI IGG GII GGG IIG 3') (BRL, Seq. ID No. 15) 
were performed. The resulting 650 base pair PCR product was cloned using the TA 
cloning system (Invitrogen) and sequenced using T3 and T7 primers. Seq. ID Nos. 1 
and 3 show the sequence of the HIP1 cDNAs obtained. 

EXAMPLE 4 

DNA AND AMINO ACID ANALYSES 
Overlapping DNA sequence was assembled using the program MacVector and sent 
via email or Netscape to the BLAST server at NIH (http://www.ncbi.nlm.nih.gov) to search 
for sequence similarities with known DNA (blastn) or protein (tblastn) sequences. Amino 
acid alignments were performed with the program ClustalW. 

EXAMPLE 5 

FISH DETECTION SYSTEM AND IMAGE ANALYSIS 
The HIP1 cDNA isolated from the two-hybrid screen was mapped by fluorescent in 
situ hybridization (FISH) to normal human lymphocyte chromosomes counterstained with 
propidium iodide and DAPI. Biotinylated probe was detected with avidin-fluorescein 
isothiocyanate (FITC). Images of metaphase preparations were captured by a 
thermoelectrically cooled charge coupled camera (Photometrics). Separate images of DAPI 
banded chromosomes and FITC targeted chromosomes were obtained. Hybridization signals 
were acquired and merged using image analysis software, pseudo-colored blue (DAPI) 
and yellow (FITC) as described, and overlaid electronically. This study showed that HIP1 
maps to a single genomic locus at 7q11.2. 

EXAMPLE 6 

NORTHERN BLOT ANALYSIS 
RNA was isolated using the single step method of homogenization in guanidinium 
isothiocyanate and fractionated on a 1.0% agarose gel containing 0.6 M formaldehyde. The 
RNA was transferred to a Hybond N membrane (Amersham) and crosslinked with ultraviolet 
radiation. 

Hybridization of the Northern blot with β-actin as an internal control probe 
provided confirmation that the RNA was intact and had transferred. The 1.2 kb HIP1 cDNA 
was labeled using nick translation and incorporation of α-32P-dCTP. Hybridization of the 
original 1.2 kb HIP1 cDNA was carried out in Church buffer (0.5 M sodium phosphate 
buffer, pH 7.2, 2.7% sodium dodecyl sulphate, 1 mM EDTA) at 55°C overnight. Following 
hybridization, Northern blots were washed once for 10 minutes in 2.0X SSPE, 0.1% SDS at 
room temperature and twice for 10 minutes in 0.15X SSPE, 0.1% SDS. Autoradiography 
was carried out for one to three days using Hyperfilm (Amersham) at -70°C. 

Analysis of HIP1 RNA levels by Northern blot revealed that the 
10 kilobase HIP1 message is present in all tissues assessed. However, the levels of RNA are 
not uniform, with brain having the highest levels of expression and peripheral tissues having less 
message. No apparent difference in RNA expression was noted between control samples 
and HD affected individuals. 

EXAMPLE 7 

TISSUE LOCALIZATION OF HIP1 
Tissue localization of HIP1 was studied using a variety of techniques as described 
below. 

Subcellular distribution of HIP-1 protein in adult human and mouse brain 

Biochemical fractionation studies revealed the HIP1 protein to be a membrane-associated 
protein. No immunoreactivity was seen by Western blotting in cytosolic fractions, using the 
anti-HIP1-pep1 polyclonal antibody. HIP1 immunoreactivity was observed in all membrane 
fractions including nuclei (P1), mitochondria and synaptosomes (P2), microsomes and 
plasma membranes (P3). The P3 fraction contained the most HIP1 compared to other 
membrane fractions. HIP1 could be removed from membranes by high salt (0.5 M NaCl) 
buffers, indicating it is not an integral membrane protein; however, since low salt (0.1-0.25 M 
NaCl) was only able to partially remove HIP1 from membranes, its membrane association is 
relatively strong. The extraction of P3 membranes with the non-ionic detergent Triton 
X-100 revealed HIP1 to be a Triton X-100 insoluble protein. This characteristic is shared by 
many cytoskeletal and cytoskeletal-associated membrane proteins including actin, which was 
used as a control in this study. The biochemical characteristics of HIP1 described were found 
to be identical in mouse and human brain and were the same for both forms of the protein 
(both bands of the HIP1 doublet). HIP1 co-localized with huntingtin in the P2 and P3 
membrane fractions, including the high-salt membrane extractions, as well as in the Triton 
X-100 insoluble residue. The subcellular distribution of HIP1 was unaffected by the 
expression of polyglutamine-expanded huntingtin in transgenic mice and HD patient brain 
samples. 

The localization of HIP1 protein was further investigated by immunohistochemistry in 
normal adult mouse brain tissue. Immunoreactivity was seen in a patchy, reticular pattern in 
the cytoplasm, appeared excluded from the nucleus and stained most intensely in a 
discontinuous pattern at the membrane. These results are consistent with the association of 
HIP1 with the cytoskeletal matrix and further indicate an enrichment of HIP1 at plasma 
membranes. Immunoreactivity occurred in all regions of the brain, including cortex, 
striatum, cerebellum and brainstem, but appeared most strongly in neurons and especially in 
cortical neurons. As described previously, huntingtin immunoreactivity was seen exclusively 
and uniformly in the cytosol. 

The in situ hybridization studies showed HIP1 mRNA to be ubiquitously and 
generally expressed throughout the brain. These data are consistent with the 
immunohistochemical results and were identical to the distribution pattern of huntingtin mRNA in 
transgenic mouse brains expressing full-length human huntingtin. 

Protein Preparation And Western Blotting For Expression Studies 

Frozen human tissues were homogenized using a Polytron in a buffer containing 
0.25 M sucrose, 20 mM Tris-HCl (pH 7.5), 10 mM EGTA, 2 mM EDTA supplemented with 
10 µg/ml of leupeptin, soybean trypsin inhibitor and 1 mM PMSF, then centrifuged at 
4,000 rpm for 10 minutes at 4°C to remove cellular debris. 100-150 µg/lane of protein was 
separated on 8% SDS-PAGE mini-gels and then transferred to PVDF membranes. Huntingtin and HIP1 
were electroblotted overnight in Towbin's transfer buffer (25 mM Tris-HCl, 0.192 M glycine, 
pH 8.3, 10% methanol) at 30 V onto PVDF membranes (Immobilon-P, Millipore) as described 
(Towbin et al., Proc. Nat'l Acad. Sci. (USA) 76: 4350-4354 (1979)). Membranes were blocked 
for 1 hour at room temperature in 5% skim milk/TBS (10 mM Tris-HCl, 0.15 M NaCl, 
pH 7.5). Antibodies against huntingtin (pAb BKP1, 1:500), actin (mAb A-4700, Sigma, 
1:500) or HIP1 (pAb HIP-pep1, 1:200) were added to blocking solution for 1 hour at room 
temperature. After 3 x 10 minute washes in TBS-T (0.05% Tween-20/TBS), secondary Ab 
(horseradish peroxidase conjugated IgG, Biorad) was applied in blocking solution for 1 hour 
at room temperature. Membranes were washed and then incubated in chemiluminescent ECL 
solution and visualized using Hyperfilm-ECL film (Amersham). 

Generation of Antibodies 

The generation of huntingtin specific antibodies GHM1 and BKP1 is described 
elsewhere (Kalchman, et al., J. Biol. Chem. 271: 19385-19394 (1996)). The HIP1 peptide 
(VLEKDDLMDMDASQQN, a.a. 76-91 of Seq. ID No. 2) was synthesized with Cys on the 
N-terminus for coupling, and coupled to keyhole limpet hemocyanin (KLH) (Pierce) 
with succinimidyl 4-(N-maleimidomethyl) cyclohexane-1-carboxylate (Pierce). Female 
New Zealand White rabbits were injected with HIP1 peptide-KLH and Freund's adjuvant. 
Antibodies against the HIP1 peptide were purified from rabbit sera using an affinity column 
with low pH elution. The affinity column was made by incubation of HIP1 peptide with 
activated thio-Sepharose (Pharmacia). 

Western blotting of various peripheral and brain tissues was consistent with the RNA 
data. The HIP1 protein levels observed were not equivalent in all tissues. The protein 
expression is predominant in brain tissue, with highest amounts seen in the cortex and lower 
levels seen in the cerebellum and caudate and putamen. 

More region-specific analysis of HIP1 expression in the brain revealed no differential 
expression pattern in affected individuals when compared to normal controls, with highest 
levels of expression seen in both controls and HD patients in the cortical regions. 

EXAMPLE 8 

CO-IMMUNOPRECIPITATION OF HIP1 WITH HUNTINGTIN 
Confirmation of the HD-HIP1 interaction was performed using co-immunoprecipitation 
as follows. Control human brain (frontal cortex) lysate was prepared in the same manner as 
for the subcellular localization study. Prior to immunoprecipitation, tissue lysate was 
centrifuged at 5000 rpm for 2 minutes at 4°C, then the supernatant was pre-cleared by 
incubation with an excess amount of Protein A-Sepharose for 30 minutes at 4°C, and 
centrifuged under the same conditions. Fifty microlitres of supernatant (500 µg protein) was 
incubated with or without antibodies (10 µg of anti-huntingtin GHM1 (Kalchman, et al. 1996) 
or anti-synaptobrevin antibody) in a total of 500 µl of incubation buffer (20 mM Tris-Cl 
(pH 7.5), 40 mM NaCl, 1 mM MgCl2) for 1 hour at 4°C. Twenty microlitres of Protein 
A-Sepharose (1:1 suspension, for GHM1 and no antibody control) or Protein G-Sepharose 
(for anti-synaptobrevin antibody; Pharmacia) was added and incubated for 1 hour at 4°C. 
The beads were washed with washing buffer (incubation buffer containing 0.5% Triton 
X-100) three times. The samples on the beads were separated using SDS-PAGE (7.5% 
acrylamide) and transferred to PVDF membrane (Immobilon-P, Millipore). The membrane 
was cut at about 150 kDa after transfer for Western blotting (as described above). The upper 
piece was probed with anti-huntingtin BKP1 (1/1000) and the lower piece with anti-HIP1 
antibody (1/300). 

The results showed that when an anti-HIP1 polyclonal antibody was immunoreacted 
against a blot containing the GHM1 immunoprecipitates from the brain lysate, a doublet was 
observed at approximately 100 kDa. When GHM1 was immunoreacted against the same 
immunoprecipitate, the 350 kDa HD protein was also seen. The specificity of the HD-HIP1 
interaction is shown by the absence of immunoreactive bands attributable to proteins adsorbing 
to the Protein A-Sepharose (Lysate + No Antibody) or when a random, unrelated antibody 
(Lysate + anti-Synaptobrevin) is used as the immunoprecipitating antibody. 



EXAMPLE 9 

Subcellular fractionation of brain tissue 
Cortical tissue (20-100 mg/ml) was homogenized, on ice, in a 2 ml pyrex-teflon 
IKA-RW15 homogenizer (Tekmar Company) in a buffer containing 0.303 M sucrose, 20 mM 
Tris-HCl pH 6.9, 1 mM MgCl2, 0.5 mM EDTA, 1 mM PMSF, 1 mM leupeptin, soybean 
trypsin inhibitor and 1 mM benzamidine (Wood et al., Human Molec. Genet. 5: 481-487 
(1996)). 

Crude membrane vesicles were isolated by two cycles of a three-step differential 
centrifugation protocol in a Beckman TLA 120.2 rotor at 4°C based on the methods of Wood 
et al. (1996). The first step precipitated cellular debris and nuclei from tissue homogenates for 
5 minutes at 1300 x g (P1). The 1300 x g supernatant was subsequently centrifuged for 20 
minutes at 14 000 x g to isolate synaptosomes and mitochondria (P2). Finally, microsomal 
and plasma membrane vesicles were collected by a 35 minute centrifugation at 142 000 x g 
(P3). The remaining supernatant was defined as the cytosolic fraction. 

High salt extraction of membranes 
Aliquots of P3 membranes were twice suspended at 2 mg/ml in 0.5 M NaCl, 10 mM 
Tris-HCl, 2 mM MgCl2, pH 7.2, containing protease inhibitors (see above). The same buffer 
without NaCl was used as a control. The membrane suspensions were incubated on ice for 30 
minutes and then centrifuged at 142 000 x g for 30 minutes. 

Extraction of cytoskeletal and cytoskeletal-associated proteins 

To extract cytoskeletal proteins, crude membrane vesicles from the P3 fraction 
were suspended in a volume of Triton X-100 extraction buffer to give a protein:detergent 
ratio of 5:1. The composition of the Triton X-100 extraction buffer was based on 
the methods of Arai et al., J. Neuroscience 38: 348-357 (1994) and contained 2% Triton 
X-100, 10 mM Tris-HCl, 2 mM MgCl2, 1 mM leupeptin, soybean trypsin inhibitor, PMSF and 
benzamidine. Membrane pellets were suspended by hand with a round-bottom teflon pestle, 
and placed on ice for 40 minutes. Insoluble cytoskeletal matrices were precipitated for 35 
minutes at 142 000 x g in a Beckman TLA 120.2 rotor. The supernatant was defined as 
non-cytoskeletal-associated membrane or membrane-associated protein and was removed. 
The remaining pellet was extracted with Triton X-100 a second time using the same 
conditions. We defined the final pellet as cytoskeletal and cytoskeletal-associated protein. 

Solubilization of protein and analysis by SDS-PAGE and Western Blotting 

Membrane and cytoskeletal protein was solubilized in a minimum volume of 1% 
SDS, 3 M urea, 0.1 mM dithiothreitol in TBS buffer and sonicated. Protein concentration was 
determined using the BioRad DC Protein assay and samples were diluted at least 1X with 5X 
sample buffer (250 mM Tris-HCl pH 6.8, 10% SDS, 25% glycerol, 0.02% bromophenol 
blue and 7% 2-mercaptoethanol) and were loaded on 7.5% SDS-PAGE gels (Bio-Rad 
Mini-PROTEAN II Cell system) without boiling. Western blotting was performed as 
described above. 

Immunohistochemistry 

Brain tissue was obtained from a normal C57BL/6 adult (6 months old) male mouse 
sacrificed with chloroform then perfusion-fixed with 4% v/v paraformaldehyde/0.01 M 
phosphate buffer (4% PFA). The brain tissues were removed, immersion fixed in 4% PFA 
for 1 day, washed in 0.01 M phosphate buffered saline, pH 7.2 (PBS) for 2 days, and then 
equilibrated in 25% w/v sucrose PBS for 1 week. The samples were then snap-frozen in 
Tissue Tek molds by isopentane cooled in liquid nitrogen. After warming to -20°C, frozen 
blocks derived from frontal cortex, caudate/putamen, cerebellum and brainstem were cut into 
14 µm sections for immunohistochemistry. Following washing in PBS, the tissue sections 
were blocked using 2.5% v/v normal goat serum for 1 hour at room temperature. Primary 
antibodies diluted with PBS were applied to sections overnight at 4°C. Optimal dilutions for 
the polyclonal antibodies BKP1 and HIP1 were 1:50. Using washes of 3 x 5 minutes in PBS 
at room temperature, sections were sequentially incubated with biotinylated secondary 
antibody and then an avidin-biotin complex reagent (Vectastain ABC Kit, Vector) for 60 
minutes each at room temperature. Color was developed using 3,3'-diaminobenzidine 
tetrahydrochloride and ammonium nickel sulfate. 

For controls, sections were treated as described above except that HIP1 antibody 
aliquots were preabsorbed with an excess of HIP1 peptide as well as a peptide unrelated to 
HIP1 prior to incubation with the tissue sections. 

In situ hybridization 

In situ hybridization was performed as previously described with some modification 
(Suzuki et al., BBRC 219: 708-713 (1996)). The RNA probes were prepared using the 
plasmid gtl49 (Lin, B., et al., Human Molec. Genet. 2: 1541-1545 (1994)) or a 558 subclone 
of HIP1. The anti-sense and sense single-stranded RNA probes were synthesized using T3 
and T7 RNA polymerases and the In Vitro Transcription Kit (Clontech) with the addition of 
[α-35S]-CTP (Amersham) to the reaction mixture. Sense RNA probes were used as negative 
controls. For HIP1 studies normal C57BL/6 mice were used. Huntingtin probes were tested 
on two different transgenic mouse strains expressing full-length huntingtin, cDNA HD10366 
(44CAG) C57BL/6 mice and YAC HD10366 (18CAG) FVB/N mice. Frozen brain sections 
(10 µm thick) were placed onto silane-coated slides under RNase-free conditions. The 
hybridization solution contained 40% w/v formamide, 0.02 M Tris-HCl (pH 8.0), 0.005 M 
EDTA, 0.3 M NaCl, 0.01 M sodium phosphate (pH 7.0), 1x Denhardt's solution, 10% w/v 
dextran sulfate (pH 7.0), 0.2% w/v sarcosyl, yeast tRNA (500 µg/ml) and salmon sperm DNA 
(200 µg/ml). The radiolabelled RNA probe was added to the hybridization solution to give 
1 x 10^6 cpm/200 µl/section. Sections were covered with hybridization solution and incubated 
on formamide paper at 65°C for 18 hours. After hybridization, the slides were washed for 30 
minutes sequentially with 2x SSC, 1x SSC and high stringency wash solution (50% 
formamide, 2x SSC and 0.1 M dithiothreitol) at 65°C, followed by treatment with RNAse A 
(1 mg/ml) at 37°C for 30 minutes, then washed again and air-dried. The slides were first 
exposed on autoradiographic film (β-max, Amersham, UK) for 48 hours and developed for 4 
minutes in Kodak D-19 followed by a 5 minute fixation in Fuji-fix. For longer exposures, the 
slides were dipped in autoradiographic emulsion (50% w/v in distilled water, NR-2, Konica, 
Japan), air-dried and exposed for 20 days at 4°C then developed as described. Sections were 
counterstained with methyl green or Giemsa solutions. 

EXAMPLE 10 

We determined a more precise location of the HIP1 gene on chromosome 7 in the 
context of a physical and genetic map of chromosome 7, and determined its genomic 
organization. HIP1 maps by FISH and RH mapping to chromosome band 7q11.23, which 
contains the chromosomal region commonly deleted in Williams-Beuren syndrome (WS). 
We used several methods to refine the mapping of HIP1 in this region. PCR screening of a 
chromosome 7 YAC library (Scherer et al., Mammalian Genome 3: 179-181 (1992)) with 
primers from the 3' UTR of HIP1 resulted in the identification of only a single positive YAC 
clone (HSC7E512). This YAC clone had previously been shown to map near the Williams 
syndrome commonly deleted region (Osborne et al., Genomics 45: 402-406 (1997)). The 
HIP1 cDNA was then used to screen a chromosome 7 specific cosmid library from the 
Lawrence Livermore National Laboratory (LL07NC01), and the RPCI genomic P1 derived 
artificial chromosome (PAC) library (Pieter de Jong, Roswell Park, Buffalo, NY). Several 
PAC and cosmid clones that were already part of pre-assembled contigs in the Williams 
syndrome region at 7q11.23 were identified (Fig. 5). Restriction enzyme digestion, blot 
hybridization experiments and PCR screening confirmed that the clones contained the HIP1 
gene. 

We determined the exon-intron boundaries and intron sizes of HIP1. Primers were 
designed based on the sequence of the HIP1 transcript and used to sequence directly from the 
cosmid, PAC clone and long PCR products from PAC or genomic DNA. Whenever a PCR 
fragment generated was longer than predicted from the cDNA sequence, it was assumed to 
contain an intron. The size of the introns was determined by sequencing the intron directly 
or by PCR amplification of the introns from both genomic DNA and the cosmid or PAC 
clone from the region. Three sets of overlapping cosmids and a PAC clone that contain the 
entire coding sequence of HIP1 were characterized (Fig. 5). Cosmids 181G10 and 250F2 
were digested with EcoRI and cloned into the plasmid Bluescript. Further sequences were 
generated from these plasmid subclones. Intron-exon boundary sequences were then 
identified by comparing HIP1 genomic and transcript sequence. The gene is contained within 
75 kb and comprises 29 exons and 28 introns. The intron-exon boundary sequences are 
shown in Table 4, along with the exon and intron sizes. A graphic summary of these data is 
also shown in Fig. 5. Exons 1 to 28 contained the coding regions. The last and largest 
exon of the HIP1 gene was found to contain approximately 7 kb. Most of the intron-exon 
junctions followed the canonical GT-AG rule. An AT was found at the 3' splice site of exon 
1 and an AG at the 5' splice site of exon 2. Sequence data from all the exon-intron borders 
of the coding region and 3'-UTR are set forth in Seq. ID Nos. 16-44. (These sequences have 
been deposited with GenBank as Accession Nos. AF052261 to AF052288.) 
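
The canonical GT-AG rule mentioned above can be checked mechanically once intron sequences are in hand. The sketch below is illustrative only: the function name `follows_gt_ag_rule` and the example intron strings are hypothetical and are not taken from Table 4 or from the deposited sequences.

```python
def follows_gt_ag_rule(intron):
    """Return True if an intron starts with GT (donor) and ends with AG (acceptor)."""
    seq = intron.upper()
    # Require at least four bases so the donor and acceptor do not overlap.
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


if __name__ == "__main__":
    # Made-up intron sequences, for illustration only.
    examples = {
        "canonical": "GTAAGTCCTTTCAG",
        "non_canonical": "ATAAGTCCTTTCAG",
    }
    for name, seq in examples.items():
        print(name, follows_gt_ag_rule(seq))
```

A check of this kind only flags which junctions depart from the rule; it says nothing about whether a non-canonical junction is functional.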

Sequence analysis of the previously published 5' untranslated region (GenBank accession 
U79734) revealed the possibility that the open reading frame extends upstream of the ATG in 
exon 4 to a 5' ATG in exon 1. Although we failed to obtain any additional 5' sequences 
despite repeated 5' RACE analyses, an additional ATG, 284 bp upstream of the previously 
published exon 1, is in the same reading frame and has the surrounding sequence 
TGGGATGTT, which is similar to AGGGATGGG, the consensus Kozak sequence 
(Kozak, M. Nucl. Acids Res. 15: 8125-8148 (1987)). If translated from this ATG, the protein 
would be highly homologous to the N-terminal portion of ZK370.3 and yeast Sla2 protein 
(Fig. 6). The translated protein in the region of exons 1 to 3 shows an identity of >40% and 
similarity of >60% to the N-terminal part of ZK370.3. This suggests that exons 1 to 3 
are probably translated. 
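
The similarity between the sequence surrounding the upstream ATG and the quoted consensus can be made concrete with a position-by-position comparison. This is a minimal sketch, not the analysis the authors performed; the function name `matching_positions` is an assumption, and the two nine-base strings are used exactly as printed above.

```python
def matching_positions(candidate, consensus):
    """Count positions at which two equal-length sequences carry the same base."""
    if len(candidate) != len(consensus):
        raise ValueError("sequences must have the same length")
    return sum(a == b for a, b in zip(candidate.upper(), consensus.upper()))


if __name__ == "__main__":
    candidate = "TGGGATGTT"   # sequence around the upstream ATG, as printed
    consensus = "AGGGATGGG"   # consensus quoted in the text, as printed
    matches = matching_positions(candidate, consensus)
    print(f"{matches} of {len(consensus)} positions match")
```

A plain match count weights every position equally, which is a simplification.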

In western blot studies, HIP1 is identified as a 120 kd protein (11, 23), while the 
putative translation of the previously published cDNA gives a protein product with an estimated 
molecular weight of approximately 100 kd. If the HIP1 gene were translated from the ATG 284 
bp upstream of exon 1, the expected product would have an estimated molecular weight 
of 122 kd. RNA PCR studies with primers downstream of this ATG and primers in exon 7 
amplify expected products of 576 and 600 bp. Taken together, these data support the 
contention that exon 1 extends further 5' and that the HIP1 gene is translated from the ATG in 
exon 1. Sequence analyses showed no TATA box, CAAT box or any GC rich promoter sequence 
upstream of the exon 1 ATG. The promoter prediction programs provided by the server 
http://dot.imgen.bcm.tmc.edu:9331/seq.search/gene.search.html did not predict any promoter 
upstream of the ATG at position -284 (position 0 corresponds to the first nucleotide of the 
published cDNA, GenBank accession U79734). This suggests that HIP1 may have additional 
exons. 

Finally, we evaluated the HIP1 gene as a candidate gene for Huntington disease in 
families without CAG expansion. In a large study of 1022 patients with a clinical diagnosis 
of HD, no CAG repeat expansion was found in 12 patients who might represent phenocopies 
of HD. In at least three families, linkage studies have excluded the HD locus at 4p. 
Mutation in an interacting protein could result in a similar phenotype, as illustrated by the 
discovery of mutations in dystrophin associated proteins in muscular dystrophies. A 
mutation in HIP1 may result in altered interaction of huntingtin and HIP1 and lead to cellular 
toxicity as a result of more HIP1 being free in the cytosol. Thus mutations in genes for 
huntingtin interacting proteins may cause a phenotype suggestive of HD. We studied two of the 
larger families diagnosed with HD without CAG expansion in the HD gene, with the highly 
informative marker D7S1816, which maps centromeric and very close to the HIP1 gene. The 
clinical findings in both families were compatible with a diagnosis of HD, although there 
were atypical features. In family 1733, the HIP1 locus appears to be excluded, as there are two 
recombinants with the marker. Individuals II-5 and II-7, who do not share the haplotype with 
the affected individuals, are now 41 and 39 years old and have normal neurological 
examinations. 

In family 1602, a lod score of 1.92 is obtained with the marker D7S1816 at θ = 0.00. 
Sequencing of all the coding exons did not reveal any mutation in any exon sequence. The 
promoter sequence has not been examined. Subsequently, a whole genome scan revealed 
higher lod scores for markers on chromosome 20p. 
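
For readers unfamiliar with lod scores, the sketch below shows the textbook two-point calculation under the simplest possible assumptions: phase-known, fully informative meioses with complete penetrance. It is not the analysis used for the families described here, which would normally be carried out with dedicated linkage software on the full pedigree; the function name `two_point_lod` and its inputs are hypothetical.

```python
import math


def two_point_lod(theta, recombinants, meioses):
    """LOD(theta) = log10[ theta^k (1 - theta)^(n - k) / 0.5^n ].

    k is the number of recombinant meioses and n the total number of
    informative, phase-known meioses (a simplifying assumption).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    k, n = recombinants, meioses
    likelihood = (theta ** k) * ((1.0 - theta) ** (n - k))
    if likelihood == 0.0:
        # Any recombinant at theta = 0 excludes complete linkage.
        return float("-inf")
    return math.log10(likelihood) - n * math.log10(0.5)


if __name__ == "__main__":
    # Hypothetical counts: 10 informative meioses, no recombinants.
    print(round(two_point_lod(0.0, 0, 10), 3))
```

Under these assumptions each non-recombinant meiosis contributes log10(2) to the score at θ = 0, while a single recombinant drives the score at θ = 0 to minus infinity.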

EXAMPLE 11 

A mouse brain lambda ZAPII cDNA library (Stratagene # 93609) was screened with 
various mouse ESTs which showed homology to the human HIP1 cDNA sequence (see Fig. 
7). The ESTs were initially isolated from the non-redundant Database of GenBank EST 
Division by performing a BLASTN search using a fragment of the human HIP1 cDNA as the query. 
We obtained 4 different ESTs which showed homology to HIP1: 
1) aa110840 (clone 520282), which is 399 bp and shows 58% identity, at the nucleotide level, 
to position 1880 to 2259 of the HIP1 cDNA. 
2) w82687 (clone 404331), which is 420 bp and shows 66% identity, at the nucleotide level, 
to position 2750 to 2915 of the HIP1 cDNA. 
3) aa138903 (clone 586510), which is 509 bp and shows 88% identity, at the nucleotide level, 
to position 2763 to 2832 of the HIP1 cDNA. 
4) aa388714 (clone 569088), which is 404 bp and shows 88% identity, at the nucleotide level, 
to position 2475 to 2692 of the HIP1 cDNA. 
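
Percent identity figures such as those listed above are derived from an alignment by counting identical columns. The sketch below assumes two sequences that have already been aligned to equal length, with "-" marking gaps; it is a simplified illustration rather than a reimplementation of BLASTN, and the function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns in which both sequences carry the same base.

    Columns containing a gap count toward the alignment length but never
    count as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    identical = sum(1 for a, b in pairs if a == b and a != gap)
    return 100.0 * identical / len(aligned_a)


if __name__ == "__main__":
    # Made-up aligned fragments, for illustration only.
    print(percent_identity("ACGTTGCA", "ACGATG-A"))
```

Because the value is computed only over the aligned region, a high identity over a short stretch and a lower identity over a long stretch are not directly comparable.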

mHIP1: 

Fifty nanograms of a 362 bp KpnI & PvuII fragment of clone 569088 (containing EST 
aa388714) was radioactively labeled with [32P]-dCTP using random-priming. The probe 
was allowed to hybridize to filters containing > 2 x 10^5 pfu/ml of the mouse brain lambda 
ZAPII cDNA library (Stratagene # 93609) overnight at 65°C in Church buffer (0.5 M sodium 
phosphate buffer (pH 7.2), 2.7% SDS, 1 mM EDTA). The filters were washed at room 
temperature for 15 minutes with 2X SSPE, 0.1% SDS, then at 65°C for 20 minutes with 
1X SSPE, 0.1% SDS and finally twice at 65°C with 0.5X SSPE, 0.1% SDS. The filters were 
exposed to X-ray film (Kodak, XAR5) overnight at -70°C. Primary positives were isolated, 
replated and subsequent secondary positives were hybridized and washed as for the primary 
screen. The resulting positive phage was converted into plasmid DNA by conventional 
methods (Stratagene) and the cDNA, termed 4n-nl, was isolated and sequenced 551 bp and 
541 bp from the T7 and T3 end, respectively. 4n-nl is 2.2 kb in length and the T7 end showed 
72% identity, at the nucleotide level, to position 1486 to 1715 of the HIP1 cDNA. The 2.2 kb 
insert from 4n-nl was excised using EcoRI. Fifty nanograms of the 2.2 kb insert was used to 
produce a radioactive probe and used to screen the mouse brain lambda ZAPII cDNA library 
(Stratagene # 93609) in the same manner as above. The resulting positive phage was 
converted into plasmid DNA by conventional methods (Stratagene) and the cDNA, termed 
mHIP1, was isolated and completely sequenced. mHIP1 is 2.3 kb in length and showed 85% 
identity, at the nucleotide level, to position 726 to 3072 of the HIP1 cDNA. 

mHIP1a: 

Fifty nanograms of a 1.3 kb EcoRI & NcoI fragment of clone 404331 (containing EST 
w82687) was radioactively labeled with [32P]-dCTP using random-priming. The probe was 
allowed to hybridize to filters containing > 2 x 10^5 pfu/ml of the mouse brain lambda ZAPII 
cDNA library (Stratagene # 93609) overnight at 65°C in Church buffer (see above). The 
filters were washed at room temperature for 15 minutes with 2X SSPE, 0.1% SDS, then at 
65°C for 20 minutes with 1X SSPE, 0.1% SDS and finally twice at 65°C with 0.2X SSPE, 
0.1% SDS. The filters were exposed to X-ray film (Kodak, XAR5) overnight at -70°C. 
Primary positives were isolated, replated and subsequent secondary positives were 
hybridized and washed as for the primary screen. The resulting positive phage was converted 
into plasmid DNA by conventional methods (Stratagene) and the cDNA, termed mHIP1a, was 
isolated and completely sequenced. mHIP1a is 3.96 kb in length and shows 60% identity, at 
the nucleotide level, to position 12 to 2703 of the HIP1 cDNA. 

EXAMPLE 12 

HIP1a: 

The entire mHIP1a cDNA sequence was used to screen the non-redundant Database 
of GenBank EST Division. We identified a human EST, T08283, which showed homology to 
mHIP1a. T08383 (clone HIBBB80) is 391 bp and shows 87% identity, at the nucleotide level, 
to position 2904 to 3113 of the mHIP1a cDNA. 

Fifty nanograms of a 1.6 kb HindIII & NotI fragment of clone 404331 (containing 
EST T08283) was radioactively labeled with [32P]-dCTP using random-priming. The probe 
was allowed to hybridize to filters containing > 2 x 10^5 pfu/ml of a human frontal cortex 
lambda cDNA library overnight at 65°C in Church buffer (see above). The filters were 
washed at 65°C for 10 minutes with 1X SSPE, 0.1% SDS, and then for 30 minutes and 15 
minutes with 0.1X SSPE, 0.1% SDS. The filters were exposed to X-ray film (Kodak, XAR5) 
overnight at -70°C. Primary positives were isolated, replated and subsequent secondary 
positives were hybridized and washed as for the primary screen. The resulting positive phage 
was converted into plasmid DNA by conventional methods (Stratagene) and the cDNA, 
termed HIP1a, was isolated and completely sequenced. HIP1a is 3.2 kb in length and shows 
53% identity, at the nucleotide level, to position 876 to 3058 of the HIP1 cDNA. 

EXAMPLE 13 

Following the identification of a 1.2 kb partial human HIP-1 cDNA by yeast 
two-hybrid interaction studies, a 3.9 kb HIP-1 fragment was isolated from a cDNA library, 
ligated to a 5' RACE product then subcloned into the mammalian expression vector pCI-neo 
(Promega). This construct, CMV-HIP-1, expresses HIP-1 from the CMV promoter and was 
used in the cell expression studies described below. Mouse HIP-1a (mHIP-1a) was also 
subcloned into a CMV driven expression vector for cell culture expression studies. 



EXAMPLE 14 

Huntingtin proteins with expanded polyglutamine tracts can aggregate into large, 
25 irregularly shaped deposits in HD brains, transgenic mice and in vitro cell culture. We have 
shown that in HEK (human embryonic kidney) 293T cells the aggregation of full-length and 
larger huntingtin fragments occurs after the cells have been exposed to a period of apoptotic 
stress. In order to assess the consequence of HIP-1 expression in cultured cells, we used 
huntingtin aggregation as one marker of viability.

Human embryonic kidney cells (HEK 293T) were grown on glass coverslips in
Dulbecco's modified Eagle medium (DMEM, Gibco, NY) with 10% fetal bovine serum and
antibiotics, in 5% CO2 at 37°C. The cells were transfected at 30% confluency with the
calcium phosphate protocol by mixing Qiagen-prepared DNA (Qiagen, CA) with 2.5 M
CaCl2, then incubating at room temperature for 10 min. 2X HEPES buffer (240 mM NaCl,
3.0 mM Na2HPO4, 100 mM HEPES, pH 7.05) was added to the DNA/calcium mixture,
incubated at 37°C for 60 sec, then added to the cells. After 12-18 h, the media was removed,
the cells were washed and fresh media was added. At 36 h post-transfection, the cells were
exposed to an apoptotic stress by treatment with 35 µM tamoxifen (Sigma) for 1 hour, or left
untreated, then processed for immunofluorescence. The cells were washed with PBS, fixed in
4% paraformaldehyde/PBS solution for 20 minutes at room temperature, then permeabilized
in 0.5% Triton X-100/PBS for 5 min. Following three PBS washes, the cells were incubated
with anti-huntingtin antibody MAB2166 (Chemicon) (1:2500 dilution) and anti-HIP-1
antibody HIP-1fp (1:100 dilution) in 0.4% BSA/PBS for 1 h at room temperature in a
humidified container. The primary antibody was removed, the cells were washed and
secondary antibodies conjugated to Texas red or FITC were added at a 1:600-1:800 dilution
for 30 min at room temperature. The cells were then washed again, and the coverslips were
mounted onto slides with DAPI (4',6-diamidino-2-phenylindole, Sigma) as a nuclear
counter-stain. Immunofluorescence was viewed using a Zeiss (Axioscope) microscope,
digitally captured with a CCD camera (Princeton Instrument Inc.) and the images were
colourized and overlapped using the Eclipse (Empix Imaging Inc.) software program.
Appropriate control experiments were performed to determine the specificity of the
antibodies, including secondary antibody only and mock transfected cells.
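
The buffer composition and antibody dilutions given in this protocol reduce to simple arithmetic. The sketch below shows one way to compute the mass of each salt for a chosen volume of 2X HEPES buffer and the volume of antibody stock for a given dilution. The molar masses assume anhydrous Na2HPO4 and HEPES free acid, which the text does not specify, and the function names are hypothetical.

```python
# Molar masses in g/mol. Assumed forms: anhydrous Na2HPO4 and HEPES free acid;
# the protocol does not say which forms were weighed out.
MOLAR_MASS = {"NaCl": 58.44, "Na2HPO4": 141.96, "HEPES": 238.30}

# Final concentrations of the 2X HEPES buffer in mM, as stated in the text.
BUFFER_MM = {"NaCl": 240.0, "Na2HPO4": 3.0, "HEPES": 100.0}


def grams_needed(volume_ml: float) -> dict:
    """Grams of each component for `volume_ml` of 2X HEPES buffer."""
    litres = volume_ml / 1000.0
    return {
        name: round(mm / 1000.0 * litres * MOLAR_MASS[name], 4)
        for name, mm in BUFFER_MM.items()
    }


def stock_volume_ul(final_volume_ul: float, dilution_factor: float) -> float:
    """Volume of stock for a 1:dilution_factor dilution in final_volume_ul."""
    if dilution_factor <= 0:
        raise ValueError("dilution factor must be positive")
    return final_volume_ul / dilution_factor


if __name__ == "__main__":
    print(grams_needed(100.0))            # masses for 100 ml of buffer
    print(stock_volume_ul(500.0, 2500.0))  # stock volume for a 1:2500 dilution in 500 ul
```

The pH adjustment to 7.05 is not modelled here; it has to be done at the bench.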

The huntingtin fragment HD1955 was used in the aggregation studies. This fragment
represents the N-terminal 548 amino acids of huntingtin, and corresponds approximately to
the polyglutamine-containing fragment produced by caspase 3 cleavage of huntingtin.
Transfection of HD1955 with 15 polyglutamines (HD1955-15) results in a diffuse
cytoplasmic distribution of the expressed protein. Transfection of HD1955 with 128
polyglutamines (HD1955-128) also results in diffuse cytoplasmic expression. However,
exposure of cells transfected with HD1955-128 to tamoxifen results in a marked
redistribution of huntingtin. In 29% of cells expressing HD1955-128, the huntingtin protein
appears as dense aggregates that are localized in the perinuclear area of the cell. In contrast,
less than 1% of HD1955-128 expressing cells contain aggregates in the absence of tamoxifen,
and 0% of HD1955-15 cells form aggregates in the presence or absence of tamoxifen
treatment.

Co-transfection of HIP-1 and HD1955 was used to test the influence of HIP-1 on
huntingtin aggregation. As a control, β-galactosidase was co-transfected with HD1955. In the
control transfections, 1-2% of cells expressing HD1955-128 formed aggregates in the absence
of tamoxifen, similar to HD1955-128 expressed alone. However, when HD1955-128 was
co-expressed with HIP-1, an average of 14% of huntingtin-expressing cells contained
aggregates with no tamoxifen treatment. Double-labeling demonstrated that the majority of
the cells containing aggregates also expressed HIP-1, directly implicating HIP-1 in the
increase in aggregation. Therefore, these results indicate that HIP-1 provides sufficient stress
on the huntingtin-expressing cells to form aggregates, to the extent that tamoxifen is no
longer necessary.

EXAMPLE 15 

We next designed a series of experiments to identify a region of HIP-1 sufficient for
inducing aggregate formation of HD1955-128. As described above, HIP-1 contains a domain
with high homology to the death effector domains (DED) present in many apoptosis-related
proteins. The DED domain of HIP-1 was ligated in-frame to HD1955-128, 3' from the
caspase-3 cleavage site. Transfection of the resulting fusion protein with the DED ligated in
the sense orientation (HD1955-128-DEDsense) resulted in a large number (30-50%) of cells
containing aggregates, without tamoxifen incubation. In contrast, expression of a
huntingtin-DED fusion protein with DED in the antisense orientation
(HD1955-128-DEDantisense) did not have more aggregates than the HD1955-128 no
tamoxifen control. Therefore, the DED domain of HIP-1 is sufficient to stress the cells,
causing aggregate formation.

EXAMPLE 16

To directly assess the effect of HIP-1 expression on cell viability, mitochondrial
function tests were performed on 293T cells transfected with HIP-1. The assessment of
mitochondrial function, using the MTT assay (Carmichael et al., Cancer Res. 47: 936-942
(1987); Vistica et al., Cancer Res. 51: 2515-2520 (1991)), is a standard method to measure
cell viability. The MTT assay quantitates the formation of a coloured product (formazan),
with the mitochondria of viable cells forming more product than non-viable cells. Since
decreased mitochondrial activity is an early consequence of many cellular toxins, the MTT
assay provides an early indicator of cell damage.

For cell viability assays, HEK 293T cells were seeded at a density of 5 x 10** cells into
96-well plates and transfected with 0.1 µg or 0.08 µg HIP-1 or 0.1 µg of the control construct
lacZ using the calcium phosphate method described above. At 24-36 hours post-transfection,
tamoxifen-treated cells were incubated for 2 hours in a 1:10 dilution of WST-1 reagent
(Boehringer Mannheim) and release of formazan from mitochondria was quantified at 450
nm using an ELISA plate reader (Dynatech Laboratories) (Carmichael et al., 1987; Mosmann,
J. Immunol. Meth. 65: 55-63 (1983)). One-way ANOVA and Newman-Keuls test were used
for statistical analysis. The transfection efficiency, measured by β-galactosidase staining and
immunofluorescence, was approximately 50%.
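
The one-way ANOVA named above compares the variation between treatment groups with the variation within them. The sketch below computes only the F statistic and its degrees of freedom using the Python standard library. It is an illustration under stated assumptions: the absorbance readings in the usage example are invented for demonstration and are not data from these experiments, the function name is hypothetical, and neither the p-value nor the Newman-Keuls post-hoc comparison is implemented.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of readings."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least two non-empty groups and spare degrees of freedom")
    grand_mean = mean([x for g in groups for x in g])
    # Between-group sum of squares: group size times squared distance of the
    # group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared distance of each reading from its
    # own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical A450 readings, n=4 wells per construct (illustrative only).
    lacz = [1.02, 0.98, 1.05, 1.00]
    hip1_010 = [0.71, 0.69, 0.75, 0.73]
    hip1_008 = [0.78, 0.80, 0.76, 0.82]
    print(one_way_anova_f([lacz, hip1_010, hip1_008]))
```

The F value is then compared against the F distribution with the returned degrees of freedom; a statistics package would normally be used for that step and for the post-hoc test.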

We have previously demonstrated that expression of mutant huntingtin results in
increased susceptibility to an apoptotic stress induced by sub-lethal doses of tamoxifen in
transfected 293T cells (Martindale et al., 1998). A similar assay was used to test the
consequence of HIP-1 expression. With 0.1 µg transfected HIP-1 DNA, after 24 hr
expression, HIP-1 resulted in increased cell death in response to tamoxifen, compared with
the tamoxifen-treated β-galactosidase control (p<0.01, n=4). Reducing the amount of
transfected HIP-1 DNA to 0.08 µg also resulted in increased cell death compared with control
(p<0.01, n=4), indicating the high potency of HIP-1 (Fig. 8). Furthermore, increased cell
death in cells transfected with HIP-1 was observed in the absence of apoptotic stress at 48 hrs
post-transfection, but was so severe that it could not be accurately quantitated. Thus, an
earlier time point (24 hr) had to be used for better reproducibility, using an apoptotic stress to

unmask the phenotype.

In order to model a pathogenic interaction of HIP-1 and huntingtin in the HEK 293
mammalian cell system, HIP-1 was transfected into cell lines stably expressing huntingtin.
Two cell lines were chosen for the initial studies: one line expressed the truncated HD1955
construct with 15 glutamines, and the second expressed HD1955 with 128 repeats.
Western blotting indicated that the cell lines expressed huntingtin at similar levels. To assess
whether HIP-1 is toxic in the presence of mutant huntingtin, 0.1 µg HIP-1 DNA was
transfected into the HD1955-128 cell line. Transfection of HIP-1 into the HD1955-15 cell
line was used as the wild-type huntingtin control, and transfection of LacZ into both cell lines
was the control for transfection-related toxicity (Figs 9A and 9B). MTT toxicity assays
showed that HIP-1 in the presence of mutant huntingtin (HD1955-128) was significantly
more toxic than HIP-1 with wild-type huntingtin (HD1955-15), p<0.001, n=4 (Fig. 9C). This
toxicity was observed at 24 hr and 36 hr post-transfection. No tamoxifen was needed to
unmask the phenotype, suggesting that the combined cell stress of HIP-1 with truncated
huntingtin was sufficient to reduce cell viability over control.



WO 99/60986

PCT/US99/11743



CLAIMS 



1. A polypeptide comprising the sequence given by Seq. ID No. 5.

2. A cDNA molecule comprising the sequence given by Seq. ID No. 6.

3. A polypeptide comprising the sequence given by Seq. ID No. 7.

4. A method for ameliorating the effects of Huntington's disease in a
patient expressing a HIP-apoptosis modulating protein, comprising the step of administering
the patient a therapeutic composition which reduces the activity of the HIP-apoptosis
modulating protein.

5. A method according to claim 4, wherein the composition comprises a
material which binds to HIP-apoptosis modulating protein.

6. The method according to claim 4, wherein the composition comprises
an expression vector encoding huntingtin having a normal number of repeats.

7. An expression vector for expression of a gene in a mammalian host
comprising a region encoding an HD-interacting polypeptide.

8. The expression vector according to claim 7, wherein the HD-interacting
polypeptide is an HIP-apoptosis modulating protein.

9. The expression vector according to claim 8, wherein the HIP-apoptosis
modulating protein has a sequence which includes the amino acid sequences given by SEQ
ID Nos. 2, 4, 5 or 7.

10. The expression vector of claim 7, wherein the HD-interacting
polypeptide interacts differently with expanded Huntingtin than with Huntingtin having a
CAG repeat region containing 15 to 35 repeats.

11. The expression vector according to claims of claims 7-10, further
comprising a region encoding Huntingtin having a polyglutamine tract of 35 or fewer.

12. A method for inducing apoptotic death in cells, comprising the step of
introducing into the cells an expression vector encoding at least the death effector domain of
a HIP-apoptosis modulating protein whereby the death effector domain is expressed by the
cells.

13. The method of claim 12, wherein the expression vector encodes the
amino acid sequence given by Seq. ID. No. 2.

14. The method of claim 12, wherein the expression vector encodes the
amino acid sequence given by Seq. ID. No. 4.

15. A method for screening a composition for the ability to inhibit
apoptosis induced by an HIP-apoptosis modulating protein, comprising simultaneously
exposing a population of cells to the composition and an HIP-apoptosis modulating protein
and measuring the extent of cell death.

>Usurpin A
SAEVIHQVEEALDTDEKKMLLFLCRDVAIDWPPNVRDLLDILRERGKLSVCDLAELLYRVHRFDLLKRILK 

>Usurpin B

yRVLMAHIGEDLDKi5DV5SLIFLMKDYMGRGKISKHKSFLDLVVELHKL>3LVAPDQLDLLEKCLKNIHRIDL^ 
>Casp-8 A 

FSRNLyDIGELQDSEDL.'VSLKELSLDYIPQRKOEPIKDALMIFQRLOEKRI'lLEESNLSFLKELLFRINRLDLLXTYLN 
>Casp-8 B

yRVMLYQISEEVSREELRSFKrLLQHEISKCKLDDDMNLLDIFIEH£i<RVILGEGKLDILKRVCAQINKSLLKIND 
>Casp-10 A 

FRHKLLTIDSNLGVQDVENLKFLCIGLVPNKKLEKSSSASDVFEiriiLAHDLLSEEDPFFLAELLYIIRQKKLLQHLNC 
>Casp-10 B

FRNLLYELBEGIDSENLKDMIFLLKDSLPKTEMTSLSFLAFLEKQGKIDEDMLTCLEDLCKTWPKLLRNIEK 
>FADD 

FLVLLHSVSSSLSSSELTELKFLCLGRVGKKKLERVQSGLDLFSMLLEQNDLEPGHTELLRELLASLRRHDLLRRVDD 
>MC159 A 

SLPFLRHLLEELDSHSDSLLLFLGHDAAPGCTTVTQALCSLSQQRKLTU^LVEMLYVLQP^JDLLKSRFG 
>MC159 B 

YHKLMVCVGEELDSSELRALRLFACNLNPSLSTALSESSRPVELVLALENVGLVSPSSVSVLADMLRTLRRLDLCQQLVE 

FRCLMALVhlDFLSDKEVEEHYFLCAPRLESHLEPGSKKSFLRLASLLEDLELLGGDKLTFLRHLLTTIGR^ 
>KS orfk13A

TYEVLCEVARKLGTDDRSWLFLLNVFLPQPTLAQLIGALRALKEEGRLTFPLLAECLPRAGRRDLLRDLLH 
>KS orfk13B

YQLTVXHVDGELCARDIRSLIFLSKDTIGSRSTPQTFLHNVYCMENLDLLGPTDVDALMSbJLRSLSRVDLQRQVQT 

SELEADLAEQQHLRQQAADDCEFLRAELDELRRQREDTEKAORSLSEIERKAQANEQRYSKLKEKYSELVQNHADLLRFO^ 
AE 

>HIP1a

GELEEQRKQKQKALVDNEQLRHELAQLRAAOLERERSQGLREEAERKASATEARYNKLKEKHSELVHVIIAELLRKNAD 
>inHIPla 

NGLEAELEEQRKQKQKALVDNEOLRHELAQLKALOLEGARNQGLREEAERKASATEARYSKLKEKHSELINTHAELLRKl^ 
AD 

>mHIP1

SELEAELAtQQHLGRQAMUDCEFLRTELDELKRQREDTEI<AQRSLTEI£RKAOANEQRYSKLKEKYSELVQNHADLLRI(N 
AE

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Kalchman, Michael
Hayden, Michael R.

Hackam, Abigail 
Chopra, Vikramjit Singh 
Nicholson, Donald W. 
Vallaincourt, John P. 
Rasper, Dita M. 

(ii) TITLE OF INVENTION: Apoptosis Modulators That Interact with the 
Huntington's Disease Gene 

(iii) NUMBER OF SEQUENCES: 44 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Oppedahl & Larson 

(B) STREET: PO Box 5270 

(C) CITY: Frisco 

(D) STATE: CO 

(E) COUNTRY: USA 

(F) ZIP: 80443-5270 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.44 Kb storage 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: MS DOS 5.0 

(D) SOFTWARE: WordPerfect 

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Larson, Marina T. 

(B) REGISTRATION NUMBER: 32038 

(C) REFERENCE/DOCKET NUMBER: UBC.P-013US2 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (970) 668-2050 

(B) TELEFAX: (970) 668-2052 

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1164 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(ix) FEATURE: cDNA for Huntingtin-interacting protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ACAGCTGACA CCCTGCAAGG CCACCGGGAC CGCTTCATGG AGCAGTTTAC 50 
AAAGTTGAAA GATCTGTTCT ACCGCTCCAG CAACCTGCAG TACTTCAAGC 100 
GGGTCATTCA GATCCCCCAG CTGCCTGAGA ACCCACCCAA CTTCCTGCGA 150 
GCCTCAGCCC TGTCAGAACA TATCAGCCCT GTGGTGGTGA TCCCTGCAGA 200
GGCCTCATCC CCCGACAGCG AGCCAGTCCT AGAGAAGGAT GACCTCATGG 250 
ACATGGATGC CTCTCAGCAG AATTTATTTG ACAACAAGTT TGATGACNTC 300 
TTTGGCAGTT CATCCAGCAG TGATCCCTTC AATTTCAACA GTCAAAATGG 350 
TGTGAACAAG GATGAGAAGG ACCACTTAAT TGAGCGACTA TACAGAGAGA 400 
TCAGTGGATT GAAGGCACAG CTAGAAAACA TGAAGACTGA GAGCCAGCGG 450 
GTTGTGCTGC AGCTGAAGGG CCACGTCAGC GAGCTGGAAG CAGATCTGGC 500 
CGAGCAGCAG CACCTGCGGC AGCAGGCGGC CGACGACTGT GAATTCCTGC 550 
GGGCAGAACT GGACGAGCTC AGGNGGCAGC GGGAGGACAC CGAGAAGGCT 600 
CAGCGGAGCC TGTCTGAGAT AGAAAGGAAA GCTCAAGCCA ATGAACAGCG 650 
ATATAGCAAG CTAAAGGAGA AGTACAGCGA GCTGGTTCAG AACCACGCTG 700
ACCTGCTGCG GAAGAATGCA GAGGTGACCA AACAGGTGTC CATGGCCAGA 750
CAAGCCCAGG TAGATTTGGA ACGAGAGAAA AAAGAGCTGG AGGATTCGTT 800 
GGAGCGCATC AGTGACCAGG GCCAGCGGAA GACTCAAGAA GAGCTGGAAG 850 
TTCTAGAGAG CTTGAAGCAG GAACTTGGCA CAAGCCAACG GGAGCTTCAG 900 
GTTCTGCAAG GCAGCCTGGA AACTTCTGCC CAGTCAGAAG CAAACTGGGC 950 
AGCCGAGTTC GCCGAGCTAG AGAAGGAGCG GGACAGCCTG GTGAGTGGCG 1000 
CAGCTCATAG GGAGGAGGAA TTATCTGCTC TTCGGAAAGA ACTGCAGGAC 1050 
ACTCAGCTCA AACTGGCCAG CACAGAGGAA TCTATGTGCC AGCTTGCCAA 1100 
AGACCAACGA AAAATGCTTC TGGTGGGGTC CAGGAAGGCT GCGGAGCAGG 1150 
TGATACAAGA CGCG 1164 
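
Each line of the listing above carries blocks of ten bases followed by a running position count, which makes transcription errors easy to detect. The sketch below is one way to rejoin such lines into a single sequence and flag any line whose stated count disagrees with the number of bases actually read. The function name is hypothetical, and the handling of counts split by stray spaces (for example "2 00") is an assumption about how such a listing may be damaged in transcription.

```python
import re


def parse_listing(lines):
    """Join sequence blocks and check the running counts.

    Returns (sequence, mismatches), where mismatches is a list of
    (line_number, stated_count, actual_count) tuples.
    """
    parts = []
    mismatches = []
    total = 0
    for line_no, line in enumerate(lines, start=1):
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        # Trailing all-digit tokens form the running count; rejoin them in
        # case the count was split by a stray space.
        digits = []
        while tokens and tokens[-1].isdigit():
            digits.insert(0, tokens.pop())
        bases = "".join(tokens).upper()
        if not re.fullmatch(r"[ACGTN]*", bases):
            raise ValueError(f"unexpected character on line {line_no}: {line!r}")
        total += len(bases)
        parts.append(bases)
        if digits:
            stated = int("".join(digits))
            if stated != total:
                mismatches.append((line_no, stated, total))
    return "".join(parts), mismatches
```

An empty mismatch list means every stated count agrees with the bases read so far; a non-empty list points at the first line where bases were dropped, duplicated, or miscounted.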

(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 386 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Thr Ala Asp Thr Leu Gln Gly His Arg Asp Arg Phe Met Glu Gln
1 5 10 15

Phe Thr Lys Leu Lys Asp Leu Phe Tyr Arg Ser Ser Asn Leu Gln
20 25 30

Tyr Phe Lys Arg Val Ile Gln Ile Pro Gln Leu Pro Glu Asn Pro
35 40 45

Pro Asn Phe Leu Arg Ala Ser Ala Leu Ser Glu His Ile Ser Pro
50 55 60

Val Val Val Ile Pro Ala Glu Ala Ser Ser Pro Asp Ser Glu Pro
65 70 75

Val Leu Glu Lys Asp Asp Leu Met Asp Met Asp Ala Ser Gln Gln
80 85 90

Asn Leu Phe Asp Asn Lys Phe Asp Asp Phe Gly Ser Ser Ser Ser
95 100 105

Ser Asp Pro Phe Asn Phe Asn Ser Gln Asn Gly Val Asn Lys Asp
110 115 120

Glu Lys Asp His Leu Ile Glu Arg Leu Tyr Arg Glu Ile Ser Gly
125 130 135

Leu Lys Ala Gln Leu Glu Asn Met Lys Thr Glu Ser Gln Arg Val
140 145 150

Val Leu Gln Leu Lys Gly His Val Ser Glu Leu Glu Ala Asp Leu
155 160 165

Ala Glu Gln Gln His Leu Arg Gln Gln Ala Ala Asp Asp Cys Glu
170 175 180

Phe Leu Arg Ala Glu Leu Asp Glu Leu Arg Gln Arg Glu Asp Thr
185 190 195

Glu Lys Ala Gln Arg Ser Leu Ser Glu Ile Glu Arg Lys Ala Gln
200 205 210

Ala Asn Glu Gln Arg Tyr Ser Lys Leu Lys Glu Lys Tyr Ser Glu
215 220 225

Leu Val Gln Asn His Ala Asp Leu Leu Arg Lys Asn Ala Glu Val
230 235 240

Thr Lys Gln Val Ser Met Ala Arg Gln Ala Gln Val Asp Leu Glu
245 250 255

Arg Glu Lys Lys Glu Leu Glu Asp Ser Leu Glu Arg Ile Ser Asp
260 265 270

Gln Gly Gln Arg Lys Thr Gln Glu Gln Leu Glu Val Leu Glu Ser
275 280 285

Leu Lys Gln Glu Leu Gly Thr Ser Gln Arg Glu Leu Gln Val Leu
290 295 300

Gln Gly Ser Leu Glu Thr Ser Ala Gln Ser Glu Ala Asn Trp Ala
305 310 315

Ala Glu Phe Ala Glu Leu Glu Lys Glu Arg Asp Ser Leu Val Ser
320 325 330

Gly Ala Ala His Arg Glu Glu Glu Leu Ser Ala Leu Arg Lys Glu
335 340 345

Leu Gln Asp Thr Gln Leu Lys Leu Ala Ser Thr Glu Glu Ser Met
350 355 360

Cys Gln Leu Ala Lys Asp Gln Arg Lys Met Leu Leu Val Gly Ser
365 370 375

Arg Lys Ala Ala Glu Gln Val Ile Gln Asp Ala
380 385 386


(2) INFORMATION FOR SEQ ID NO:3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4796 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: cDNA for Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CAGTGTACGG TTGATCATAT AACGCCGCGG GCGGGGATTG GTTTATATAT 50 
CGCAAATTGA TNTAGGGGGG GGGGGATGGN CAGAGATTTC GCTTCATTAG 100 
GCCATTATAA GCAGGAAGGG TTTCAAGGAA AAAAACCCAG AAAGTGCATA 150 
TTGCACCCAC CATGAGAAAG GGGCAACAGA CCTTNTGTTN TGTTNTCAAC 200 
CGCCTGCTTC TGTTTTAGCA ACGCAGTGTT TTGGTGGAAG TTGTGCCATG 250 
TGTTCCACAA ANTCTTCCGA GATGGACACC CGAACGTCCT GAAGGACTTT 300 
GTGAGATACA GAAATGAATT GAGTGACATG AGCAGGATGT GGGGCCACCT 350
GAGCGAGGGG TATGGCCAGC TGTGCAGCAT CTACCTGAAA CTGCTAAGAA 400 
CCAAGATGGA GTACCACACC AAAAATCCCA GGTTCCCAGG CAACCTGCAG 450 
ATGAGTGACC GCCAGCTGGA CGAGGCTGGA GAAAGTGACG TGAACAACTT 500 
TTTCCAGTTA ACAGTGGAGA TGTTTGACTA CCTGGAGTGT GAACTCAACC 550 
TCTTCCAAAC AGTATTCAAC TCCCTGGACA TGTCCCGCTC TGTGTCCGTG 600 
ACGGCAGCAG GGCAGTGCCG CCTCGCCCCG CTGATCCAGG TCATCTTGGA 650 
CTGCAGCCAC CTTTATGACT ACACTGTCAA GCTTCTCTTC AAACTCCACT 700
CCTGCCTCCC AGCTGACACC CTGCAAGGCC ACCGGGACCG CTTCATGGAG 750
CAGTTTACAA AGTTGAAAGA TCTGTTCTAC CGCTCCAGCA ACCTGCAGTA 800

CTTCAAGCGG CTCATTCAGA TCCCCCAGCT GCCTGAGAAC CCACCCAACT 850 

TCCTGCGAGC CTCAGCCCTG TCAGAACATA TCAGCCCTGT GGTGGTGATC 900 

CCTGCAGAGG CCTCATCCCC CGACAGCGAG CCAGTCCTAG AGAAGGATGA 950 

CCTCATGGAC ATGGATGCCT CTCAGCAGAA TTTATTTGAC AACAAGTTTG 1000 

ATGACATCTT TGGCAGTTCA TTCAGCAGTG ATCCCTTCAA TTTCAACAGT 1050 

CAAAATGGTG TGAACAAGGA TGAGAAGGAC CACTTAATTG AGCGACTATA 1100 

CAGAGAGATC AGTGGATTGA AGGCACAGCT AGAAAACATG AAGACTGAGA 1150 

GCCAGCGGGT TGTGCTGCAG CTGAAGGGCC ACGTCAGCGA GCTGGAAGCA 1200 

GATCTGGCCG AGCAGCAGCA CCTGCGGCAG CAGGCGGCCG ACGACTGTGA 1250

ATTCCTGCGG GCAGAACTGG ACGAGCTCAG GAGGCAGCGG GAGGACACCG 1300 

AGAAGGCTCA GCGGAGCCTG TCTGAGATAG AAAGGAAAGC TCAAGCCAAT 1350 

GAACAGCGAT ATAGCAAGCT AAAGGAGAAG TACAGCGAGC TGGTTCAGAA 1400

CCACGCTGAC CTGCTGCGGA AGAATGCAGA GGTGACCAAA CAGGTGTCCA 1450 

TGGCCAGACA AGCCCAGGTA GATTTGGAAC GAGAGAAAAA AGAGCTGGAG 1500 

GATTCGTTGG AGCGCATCAG TGACCAGGGC CAGCGGAAGA CTCAAGAACA 1550 

GCTGGAAGTT CTAGAGAGCT TGAAGCAGGA ACTTGGCACA AGCCAACGGG 1600 

AGCTTCAGGT TCTGCAAGGC AGCCTGGAAA CTTCTGCCCA GTCAGAAGCA 1650 

AACTGGGCAG CCGAGTTCGC CGAGCTAGAG AAGGAGCGGG ACAGCCTGGT 1700 

GAGTGGCGCA GCTCATAGGG AGGAGGAATT ATCTGCTCTT CGGAAAGAAC 1750

TGCAGGACAC TCAGCTCAAA CTGGCCAGCA CAGAGGAATC TATGTGCCAG 1800 

CTTGCCAAAG ACCAACGAAA AATGCTTCTG GTGGGGTCCA GGAAGGCTGC 1850 

GGAGCAGGTG ATACAAGACG CCCTGAACCA GCTTGAAGAA CCTCCTCTCA 1900 

TCAGCTGCGC TGGGTCTGCA GATCACCTCC TCTCCACGGT CACATCCATT 1950 

TCCAGCTGCA TCGAGCAACT GGAGAAAAGC TGGAGCCAGT ATCTGGCCTG 2000

CCCAGAAGAC ATCAGTGGAC TTCTCCATTC CATAACCCTG CTGGCCCACT 2050 

TGACCAGCGA CGCCATTGCT CATGGTGCCA CCACCTGCCT CAGAGCCCCA 2100 

CCTGAGCCTG CCGACTCACT GACCGAGGCC TGTAAGCAGT ATGGCAGGGA 2150 

AACCCTCGCC TACCTGGCCT CCCTGGAGGA AGAGGGAAGC CTTGAGAATG 2200 

CCGACAGCAC AGCCATGAGG AACTGCCTGA GCAAGATCAA GGCCATCGGC 2250 

GAGGAGCTCC TGCCCAGGGG ACTGGACATC AAGCAGGAGG AGCTGGGGGA 2300

CCTGGTGGAC AAGGAGATGG CGGCCACTTC AGCTGCTATT GAAACTTGCA 2350 

CGGCCAGAAT AGAGGAGATG CTCAGCAAAT CCCGAGCAGG AGACACAGGA 2400 

GTCAAATTGG AGGTGAATGA AAGGATCCTT CGTTGCTGTA CCAGCCTCAT 2450

GCAAGCTATT CAGGTGCTCA TCGTGGCCTC TAAGGACCTC CAGAGAGAGA 2500 

TTGTGGAGAG CGGCAGGGGT ACAGCATCCC CTAAAGAGTT TTATGCCAAG 2550

AACTCTCGAT GGACAGAAGG ACTTATCTCA GCCTCCAAGG CTGTGGGCTG 2600

GGGAGCCACT GTCATGGTGG ATGCAGCTGA TCTGGTGGTA CAAGGCAGAG 2650

GGAAATTTGA GGAGCTAATG GTGTGTTCTC ATGAAATTGC TGCTAGCACA 2700

GCCCAGCTTG TGGCTGCATC CAAGGTGAAA GCTGATAAGG ACAGCCCCAA 2750 

CCTAGCCCAG CTGCAGCAGG CCTCTCGGGG AGTGAACCAG GCCACTGCCG 2800

GCGTTGTGGC CTCAACCATT TCCGGCAAAT CACAGATCGA AGAGACAGAC 2850 

AACATGGACT TCTCAAGCAT GACGCTGACA CAGATCAAAC GCCAAGAGAT 2900 

GGATTCTCAG GTTAGGGTGC TAGAGCTAGA AAATGAATTG CAGAAGGAGC 2950 

GTCAAAAACT GGGAGAGCTT CGGAAAAAGC ACTACGAGCT TGCTGGTGTT 3000

GCTGAGGGCT GGGAAGAAGG AACAGAGGCA TCTCCACCTA CACTGCAAGA 3050 

AGTGGTAACC GAAAAAGAAT AGAGCCAAAC CAACACCCCA TATGTCAGTG 3100 

TAAATCCTTG TTACCTATCT CGTGTGTGTT ATTTCCCCAG CCACAGGCCA 3150

AATCCTTGGA GTCCCAGGGG CAGCCACACC ACTGCCATTA CCCAGTGCCG 3200 

AGGACATGCA TGACACTTCC CAAAGATCCC TCCATAGCGA CACCCTTTCT 3250 

GTTTGGACCC ATGGTCATCT CTGTTCTTTT CCCGCCTCCC TAGTTAGCAT 3300

CCAGGCTGGC CAGTGCTGCC CATGAGCAAG CCTAGGTACG AAGAGGGGTG 3350

GTGGGGGGCA GGGCCACTCA ACAGAGAGGA CCAACATCCA GTCCTGCTGA 3400 

CTATTTGACC CCCACAACAA TGGGTATCCT TAATAGAGGA GCTGCTTGTT 3450

GTTTGTTGAC AGCTTGGAAA GGGAAGATCT TATGCCTTTT CTTTTCTGTT 3500

TTCTTCTCAG TCTTTTCAGT TTCATCATTT GCACAAACTT GTGAGCATCA 3550 

GAGGGCTGAT GGATTCCAAA CCAGGACACT ACCCTGAGAT CTGCACAGTC 3600 

AGAAGGACGG CAGGAGTGTC CTGGCTGTGA ATGCCAAAGC CATTCTCCCC 3650

CTCTTTGGGC AGTGCCATGG ATTTCCACTG CTTCTTATGG TGGTTGGTTG 3700

GGTTTTTTGG TTTTGTTTTT TTTTTTTAAG TTTCACTCAC ATAGCCAACT 3750 

CTCCCAAAGG GCACACCCCT GGGGCTGAGT CTCCAGGGCC CCCCAACTGT 3800 

GGTAGCTCCA GCGATGGTGC TGCCCAGGCC TCTCGGTGCT CCATCTCCGC 3850

CTCCACACTG ACCAAGTGCT GGCCCACCCA GTCCATGCTC CAGGGTCAGG 3900 

CGGAGCTGCT GAGTGACAGC TTTCCTCAAA AAGCAGAAGG AGAGTGAGTG 3950 

CCTTTCCCTC CTAAAGCTGA ATCCCGGCGG AAAGCCTCTG TCCGCCTTTA 4000 

CAAGGGAGAA GACAACAGAA AGAGGGACAA GAGGGTTCAC ACAGCCCAGT 4050 

TCCCGTGACG AGGCTCAAAA ACTTGATCAC ATGCTTGAAT GGAGCTGGTG 4100 

AGATCAACAA CACTACTTCC CTGCCGGAAT GAACTGTCCG TGAATGGTCT 4150 

CTGTCAAGCG GGCCGTCTCC CTTGGCCCAG AGACGGAGTG TGGGAGTGAT 4200

TCCCAACTCC TTTCTGCAGA CGTCTGCCTT GGCATCCTCT TGAATAGGAA 4250

GATCGTTCCA CTTTCTACGC AATTGACAAA CCCGGAAGAT CAGATGCAAT 4300 

TGCTCCCATC AGGGAAGAAC CCTATACTTG GTTTGCTACC CTTAGTATTT 4350

ATTACTAACC TCCCTTAAGC AGCAACAGCC TACAAAGAGA TGCTTGGAGC 4400 

AATCAGAACT TCAGGTGTGA CTCTAGCAAA GCTCATCTTT CTGCCCGGCT 4450 

ACATCAGCCT TCAAGAATCA GAAGAAAGCC AAGGTGCTGG ACTGTTACTG 4500

ACTTGGATCC CAAAGCAAGG AGATCATTTG GAGCTCTTGG GTCAGAGAAA 4550

ATGAGAAAGG ACAGAGCCAG CGGCTCCAAC TCCTTTCAGC CACATGCCCC 4600 

AGGCTCTCGC TGCCCTGTGG ACAGGATGAG GACAGAGGGC ACATGAACAG 4650 

CTTGCCAGGG ATGGGCAGCC CAACAGCACT TTTCCTCTTC TAGATGGACC 4700

CCAGCATTTA AGTGACCTTC TGATCTTGGG AAAACAGCGT CTTCCTTCTT 4750 

TATCTATAGC AACTCATTGG TGGTAGCCAT CAAGCACTTC GGAATT 4796 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 924 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human

(ix) FEATURE: Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Ser Arg Met Trp Gly His Leu Ser Glu Gly Tyr Gly Gln Leu
1 5 10 15

Cys Ser Ile Tyr Leu Lys Leu Leu Arg Thr Lys Met Glu Tyr His
20 25 30

Thr Lys Asn Pro Arg Phe Pro Gly Asn Leu Gln Met Ser Asp Arg
35 40 45

Gln Leu Asp Glu Ala Gly Glu Ser Asp Val Asn Asn Phe Phe Gln
50 55 60

Leu Thr Val Glu Met Phe Asp Tyr Leu Glu Cys Glu Leu Asn Leu
65 70 75

Phe Gln Thr Val Phe Asn Ser Leu Asp Met Ser Arg Ser Val Ser
80 85 90

Val Thr Ala Ala Gly Gln Cys Arg Leu Ala Pro Leu Ile Gln Val
95 100 105

Ile Leu Asp Cys Ser His Leu Tyr Asp Tyr Thr Val Lys Leu Leu
110 115 120

Phe Lys Leu His Ser Cys Leu Pro Ala Asp Thr Leu Gln Gly His
125 130 135

Arg Asp Arg Phe Met Glu Gln Phe Thr Lys Leu Lys Asp Leu Phe
140 145 150

Tyr Arg Ser Ser Asn Leu Gln Tyr Phe Lys Arg Leu Ile Gln Ile
155 160 165

Pro Gln Leu Pro Glu Asn Pro Pro Asn Phe Leu Arg Ala Ser Ala
170 175 180

Leu Ser Glu His Ile Ser Pro Val Val Val Ile Pro Ala Glu Ala
185 190 195

Ser Ser Pro Asp Ser Glu Pro Val Leu Glu Lys Asp Asp Leu Met
200 205 210

Asp Met Asp Ala Ser Gln Gln Asn Leu Phe Asp Asn Lys Phe Asp
215 220 225

Asp Ile Phe Gly Ser Ser Phe Ser Ser Asp Pro Phe Asn Phe Asn
230 235 240

Ser Gln Asn Gly Val Asn Lys Asp Glu Lys Asp His Leu Ile Glu
245 250 255

Arg Leu Tyr Arg Glu Ile Ser Gly Leu Lys Ala Gln Leu Glu Asn
260 265 270

Met Lys Thr Glu Ser Gln Arg Val Val Leu Gln Leu Lys Gly His
275 280 285

Val Ser Glu Leu Glu Ala Asp Leu Ala Glu Gln Gln His Leu Arg
290 295 300

Gln Gln Ala Ala Asp Asp Cys Glu Phe Leu Arg Ala Glu Leu Asp
305 310 315

Glu Leu Arg Arg Gln Arg Glu Asp Thr Glu Lys Ala Gln Arg Ser
320 325 330

Leu Ser Glu Ile Glu Arg Lys Ala Gln Ala Asn Glu Gln Arg Tyr
335 340 345

Ser Lys Leu Lys Glu Lys Tyr Ser Glu Leu Val Gln Asn His Ala
350 355 360

Asp Leu Leu Arg Lys Asn Ala Glu Val Thr Lys Gln Val Ser Met
365 370 375

Ala Arg Gln Ala Gln Val Asp Leu Glu Arg Glu Lys Lys Glu Leu
380 385 390

Glu Asp Ser Leu Glu Arg Ile Ser Asp Gln Gly Gln Arg Lys Thr
395 400 405

Gln Glu Gln Leu Glu Val Leu Glu Ser Leu Lys Gln Glu Leu Gly
410 415 420

Thr Ser Gln Arg Glu Leu Gln Val Leu Gln Gly Ser Leu Glu Thr
425 430 435

Ser Ala Gln Ser Glu Ala Asn Trp Ala Ala Glu Phe Ala Glu Leu
440 445 450

Glu Lys Glu Arg Asp Ser Leu Val Ser Gly Ala Ala His Arg Glu
455 460 465

Glu Glu Leu Ser Ala Leu Arg Lys Glu Leu Gln Asp Thr Gln Leu
470 475 480

Lys Leu Ala Ser Thr Glu Glu Ser Met Cys Gln Leu Ala Lys Asp
485 490 495

Gln Arg Lys Met Leu Leu Val Gly Ser Arg Lys Ala Ala Glu Gln
500 505 510

Val Ile Gln Asp Ala Leu Asn Gln Leu Glu Glu Pro Pro Leu Ile
515 520 525

Ser Cys Ala Gly Ser Ala Asp His Leu Leu Ser Thr Val Thr Ser
530 535 540

Ile Ser Ser Cys Ile Glu Gln Leu Glu Lys Ser Trp Ser Gln Tyr
545 550 555

Leu Ala Cys Pro Glu Asp Ile Ser Gly Leu Leu His Ser Ile Thr
560 565 570

Leu Leu Ala His Leu Thr Ser Asp Ala Ile Ala His Gly Ala Thr
575 580 585

Thr Cys Leu Arg Ala Pro Pro Glu Pro Ala Asp Ser Leu Thr Glu
590 595 600

Ala Cys Lys Gln Tyr Gly Arg Glu Thr Leu Ala Tyr Leu Ala Ser
605 610 615

Leu Glu Glu Glu Gly Ser Leu Glu Asn Ala Asp Ser Thr Ala Met
620 625 630

Arg Asn Cys Leu Ser Lys Ile Lys Ala Ile Gly Glu Glu Leu Leu
635 640 645

Pro Arg Gly Leu Asp Ile Lys Gln Glu Glu Leu Gly Asp Leu Val
650 655 660

Asp Lys Glu Met Ala Ala Thr Ser Ala Ala Ile Glu Thr Cys Thr
665 670 675

Ala Arg Ile Glu Glu Met Leu Ser Lys Ser Arg Ala Gly Asp Thr
680 685 690

Gly Val Lys Leu Glu Val Asn Glu Arg Ile Leu Arg Cys Cys Thr
695 700 705

Ser Leu Met Gln Ala Ile Gln Val Leu Ile Val Ala Ser Lys Asp
710 715 720

Leu Gln Arg Glu Ile Val Glu Ser Gly Arg Gly Thr Ala Ser Pro
725 730 735

Lys Glu Phe Tyr Ala Lys Asn Ser Arg Trp Thr Glu Gly Leu Ile
740 745 750

Ser Ala Ser Lys Ala Val Gly Trp Gly Ala Thr Val Met Val Asp
765 770 775

Ala Ala Asp Leu Val Val Gln Gly Arg Gly Lys Phe Glu Glu Leu
780 785 790

Met Val Cys Ser His Glu Ile Ala Ala Ser Thr Ala Gln Leu Val
795 800 805

Ala Ala Ser Lys Val Lys Ala Asp Lys Asp Ser Pro Asn Leu Ala
810 815 820

Gln Leu Gln Gln Ala Ser Arg Gly Val Asn Gln Ala Thr Ala Gly
825 830 835

Val Val Ala Ser Thr Ile Ser Gly Lys Ser Gln Ile Glu Glu Thr
840 845 850

Asp Asn Met Asp Phe Ser Ser Met Thr Leu Thr Gln Ile Lys Arg
855 860 865

Gln Glu Met Asp Ser Gln Val Arg Val Leu Glu Leu Glu Asn Glu
870 875 880

Leu Gln Lys Glu Arg Gln Lys Leu Gly Glu Leu Arg Lys Lys His
885 890 895

Tyr Glu Leu Ala Gly Val Ala Glu Gly Trp Glu Glu Gly Thr Glu
900 905 910

Ala Ser Pro Pro Thr Leu Gln Glu Val Val Thr Glu Lys Glu
915 920 924



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1090 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Met Leu Leu Cys Gin Gly Ser Glu Trp Arg Arg Asp Gin Gin Leu 

5 10 15 

Gly Thr Ala Asn Ala Arg Gin Trp Cys Pro Leu Pro Gin Asp Ala 

20 25 30 

Gin Pro Ala Gly Ser Trp Glu Arg Cys Pro Pro Leu Pro Pro Ala 

35 40 45 



Gly Arg Leu Gin Gly Thr Asp His Pro Trp Gly Trp Gly Arg Leu 

50 55 60 
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Ala Gly Gly Gly Glu Arg Gly Gly Leu Trp Glu Gly Leu Ser His 

65 70 75 

Ser Gin Arg Leu lie His Leu lie Leu Leu Ser Leu Pro Leu Leu 

80 85 90 

Val Phe Gin Thr Val Ser lie Asn Lys Ala lie Asn Thr Gin Glu 

95 100 105 

Val Ala Val Lys Glu Lys His Ala Arg Thr Cys lie Leu Gly Thr 

110 115 120 

His His Glu Lys Gly Ala Gin Thr Phe Trp Ser Val Val Asn Arg 

125 130 135 

Leu Pro Leu Ser Ser Asn Ala Val Leu Cys Trp Lys Phe Cys His 

140 145 150 

Val Phe His Lys Leu Leu Arg Asp Gly His Pro Asn Val Leu Lys 

155 160 165 

Asp Ser Leu Arg Tyr Arg Asn Glu Leu Ser Asp Met Ser Arg Met 

170 175 180 

Trp Gly His Leu Ser Glu Gly Tyr Gly Gin Leu Cys Ser lie Tyr 

185 190 195 

Leu Lys Leu Leu Arg Thr Lys Met Glu Tyr His Thr Lys Asn Pro 

200 205 210 

Arg Phe Pro Gly Asn Leu Gin Met Ser Asp Arg Gin Leu Asp Glu 

215 220 225 

Ala Gly Glu Ser Asp Val Asn Asn Phe Phe Gin Leu Thr Val Glu 

230 235 240 

Met Phe Asp Tyr Leu Glu Cys Glu Leu Asn Leu Phe Gin Thr Val 

245 250 255 

Phe Asn Ser Leu Asp Met Ser Arg Ser Val Ser Val Thr Ala Ala 

260 265 270 

Gly Gin Cys Arg Leu Ala Pro Leu lie Gin Val lie Leu Asp Cys 



Ser His Leu Tyr Asp Tyr Thr Val Lys Leu Leu Phe Lys Leu His 



275 



288 



285 



290 



295 



300 



Ser Cys Leu Pro Ala Asp Thr Leu Gin Gly His Arg Asp Arg Phe 

305 310 315 
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Met Glu Gin Phe Thr Lys Leu Lys Asp Leu Phe Tyr Arg Ser Ser 

320 325 330 

Asn Leu Gin Tyr Phe Lys Arg Leu lie Gin lie Pro Gin Leu Pro 

335 340 345 

Glu Asn Pro Pro Asn Phe Leu Arg Ala Ser Ala Leu Ser Glu His 

350 355 360 

lie Ser Pro Val Val Val lie Pro Ala Glu Ala Ser Ser Pro Asp 

365 370 375 

Ser Glu Pro Val Leu Glu Lys Asp Asp Leu Met Asp Met Asp Ala 

380 385 390 

Ser Gln Gln Asn Leu Phe Asp Asn Lys Phe Asp Asp Ile Phe Gly

395 400 405

Ser Ser Phe Ser Ser Asp Pro Phe Asn Phe Asn Ser Gln Asn Gly

410 415 420

Val Asn Lys Asp Glu Lys Asp His Leu Ile Glu Arg Leu Tyr Arg

425 430 435

Glu Ile Ser Gly Leu Lys Ala Gln Leu Glu Asn Met Lys Thr Glu

440 445 450

Ser Gln Arg Val Val Leu Gln Leu Lys Gly His Val Ser Glu Leu

455 460 465

Glu Ala Asp Leu Ala Glu Gln Gln His Leu Arg Gln Gln Ala Ala

470 475 480 

Asp Asp Cys Glu Phe Leu Arg Ala Glu Leu Asp Glu Leu Arg Arg 

485 490 495 

Gln Arg Glu Asp Thr Glu Lys Ala Gln Arg Ser Leu Ser Glu Ile

500 505 510

Glu Arg Lys Ala Gln Ala Asn Glu Gln Arg Tyr Ser Lys Leu Lys

515 520 525

Glu Lys Tyr Ser Glu Leu Val Gln Asn His Ala Asp Leu Leu Arg

530 535 540

Lys Asn Ala Glu Val Thr Lys Gln Val Ser Met Ala Arg Gln Ala

545 550 555

Gln Val Asp Leu Glu Arg Glu Lys Lys Glu Leu Glu Asp Ser Leu

560 565 570

Glu Arg Ile Ser Asp Gln Gly Gln Arg Lys Thr Gln Glu Gln Leu

575 580 585

Glu Val Leu Glu Ser Leu Lys Gln Glu Leu Ala Thr Ser Gln Arg

590 595 600

Glu Leu Gln Val Leu Gln Gly Ser Leu Glu Thr Ser Ala Gln Ser

605 610 615 

Glu Ala Asn Trp Ala Ala Glu Phe Ala Glu Leu Glu Lys Glu Arg 

620 625 630 

Asp Ser Leu Val Ser Gly Ala Ala His Arg Glu Glu Glu Leu Ser 

635 640 645 

Ala Leu Arg Lys Glu Leu Gln Asp Thr Gln Leu Lys Leu Ala Ser

650 655 660

Thr Glu Glu Ser Met Cys Gln Leu Ala Lys Asp Gln Arg Lys Met

665 670 675

Leu Leu Val Gly Ser Arg Lys Ala Ala Glu Gln Val Ile Gln Asp

680 685 690

Ala Leu Asn Gln Leu Glu Glu Pro Pro Leu Ile Ser Cys Ala Gly

695 700 705

Ser Ala Asp His Leu Leu Ser Thr Val Thr Ser Ile Ser Ser Cys

710 715 720

Ile Glu Gln Leu Glu Lys Ser Trp Ser Gln Tyr Leu Ala Cys Pro

725 730 735

Glu Asp Ile Ser Gly Leu Leu His Ser Ile Thr Leu Leu Ala His

740 745 750

Leu Thr Ser Asp Ala Ile Ala His Gly Ala Thr Thr Cys Leu Arg

755 760 765

Ala Pro Pro Glu Pro Ala Asp Ser Leu Thr Glu Ala Cys Lys Gln

770 775 780 

Tyr Gly Arg Glu Thr Leu Ala Tyr Leu Ala Ser Leu Glu Glu Glu 

785 790 795 

Gly Ser Leu Glu Asn Ala Asp Ser Thr Ala Met Arg Asn Cys Leu

800 805 810

Ser Lys Ile Lys Ala Ile Gly Glu Glu Leu Leu Pro Arg Gly Leu

815 820 825

Asp Ile Lys Gln Glu Glu Leu Gly Asp Leu Val Asp Lys Glu Met

830 835 840

Ala Ala Thr Ser Ala Ala Ile Glu Thr Ala Thr Ala Arg Ile Glu

845 850 855 

Glu Met Leu Ser Lys Ser Arg Ala Gly Asp Thr Gly Val Lys Leu 

860 865 870 

Glu Val Asn Glu Arg Ile Leu Gly Cys Cys Thr Ser Leu Met Gln

875 880 885

Ala Ile Gln Val Leu Ile Val Ala Ser Lys Asp Leu Gln Arg Glu

890 895 900

Ile Val Glu Ser Gly Arg Gly Thr Ala Ser Pro Lys Glu Phe Tyr

905 910 915

Ala Lys Asn Ser Arg Trp Thr Glu Gly Leu Ile Ser Ala Ser Lys

920 925 930 

Ala Val Gly Trp Gly Ala Thr Val Met Val Asp Ala Ala Asp Leu 

935 940 945 

Val Val Gln Gly Arg Gly Lys Phe Glu Glu Leu Met Val Cys Ser

950 955 960

His Glu Ile Ala Ala Ser Thr Ala Gln Leu Val Ala Ala Ser Lys

965 970 975

Val Lys Ala Asp Lys Asp Ser Pro Asn Leu Ala Gln Leu Gln Gln

980 985 990

Ala Ser Arg Gly Val Asn Gln Ala Thr Ala Gly Val Val Ala Ser

995 1000 1005

Thr Ile Ser Gly Lys Ser Gln Ile Glu Glu Thr Asp Asn Met Asp

1010 1015 1020

Phe Ser Ser Met Thr Leu Thr Gln Ile Lys Arg Gln Glu Met Asp

1025 1030 1035

Ser Gln Val Arg Val Leu Glu Leu Glu Asn Glu Leu Gln Lys Glu

1040 1045 1050

Arg Gln Lys Leu Gly Glu Leu Arg Lys Lys His Tyr Glu Leu Ala

1055 1060 1065

Gly Val Ala Glu Gly Trp Glu Glu Gly Thr Glu Ala Ser Pro Pro

1070 1075 1080

Thr Leu Gln Glu Val Val Thr Glu Lys Glu

1085 1090



(2) INFORMATION FOR SEQ ID NO:6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3301 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: cDNA for Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

CGGTGAGCTG GAGGAGCAGC GGAAGCAGAA GCAGAAGGCC CTGGTGGATA 50 

ATGAGCAGCT CCGCCACGAG CTGGCCCAGC TGAGGGCTGC CCAGCTGGAG 100 

CGCGAGCGGA GCCAGGGCCT GCGTGAGGAG GCTGAGAGGA AGGCCAGTGC 150 

CACGGAGGCG CGCTACAACA AGCTGAAGGA AAAGCACAGT GAGCTCGTCC 200 

ATGTGCACGC GGAGCTGCTC AGAAAGAACG CGGACACAGC CAAGCAGCTG 250 

ACGGTGACGC AGCAAAGCCA GGAGGAGGTG GCGCGGGTGA AGGAGCAGCT 300 

GGCCTTCCAG GTGGAGCAGG TGAAGCGGGA GTCGGAGTTG AAGCTAGAGG 350

AGAAGAGCGA CCAGCAGGAG AAGCTCAAGA GGGAGCTGGA GGCCAAGGCC 400 

GGAGAGCTGG CCCGCGCGCA GGAGGCCCTG AGCCACACAG AGCAGAGCAA 450 

GTCGGAGCTG AGCTCACGGC TGGACACACT GAGTGCGGAG AAGGATGCTC 500 

TGAGTGGAGC TGTGCGGCAG CGGGAGGCAG ACCTGCTGGC GGCGCAGAGC 550 

CTGGTGCGCG AGACAGAGGC GGCGCTGAGC CGGGAGCAGC AGCGCAGCTC 600 

CCAGGAGCAG GGCGAGTTGC AGGGCCGGCT GGCAGAGAGG GAGTCTCAGG 650 

AGCAGGGGCT GCGGCAGAGG CTGCTGGACG AGCAGTTCGC AGTGTTGCGG 700 

GGCGCTGCTG CCGAGGCCGC GGGCATCCTG CAGGATGCCG TGAGCAAGCT 750 

GGACGACCCC CTGCACCTGC GCTGTACCAG CTCCCCAGAC TACCTGGTGA 800 

GCAGGGCCCA GGAGGCCTTG GATGCCGTGA GCACCCTGGA GGAGGGCCAC 850 

GCCCAGTACC TGACCTCCTT GGCAGACGCC TCCGCCCTGG TGGCAGCTCT 900 

GACCCGCTTC TCCCACCTGG CTGCGGATAC CATCATCAAT GGCGGTGCCA 950 

CCTCGCACCT GGCTCCCACC GACCCTGCCG ACCGCCTCAT AGACACCTGC 1000 

AGGGAGTGCG GGGCCCGGGC TCTGGAGCTC ATGGGGCAGC TGCAGGACCA 1050 

GCAGGCTCTG CGGCACATGC AGGCCAGCCT GGTGCGGACA CCCCTGCAGG 1100 

GCATCCTTCA GCTGGGCCAA GAACTGAAAC CCAAGAGCCT AGATGTGCGG 1150 

CAGGAGGAGC TGGGGGCCGT GGTCGACAAG GAGATGGCGG CCACATCCGC 1200 

AGCCATTGAA GATGCTGTGC GGAGGATTGA GGACATGATG AACCAGGCAC 1250 

GCCACGCCAG CTCGGGGGTG AAGCTGGAGG TGAACGAGAG GATCCTCAAC 1300 

TCCTGCACAG ACCTGATGAA GGCTATCCGG CTCCTGGTGA CGACATCCAC 1350 

TAGCCTGCAG AAGGAGATCG TGGAGAGCGG CAGGGGGGCA GCCACGCAGC 1400 

AGGAATTTTA CGCCAAGAAC TCGCGCTGGA CCGAAGGCCT CATCTCGGCC 1450

TCCAAGGCTG TGGGCTGGGG AGCCACACAG CTGGTGGAGG CAGCTGACAA 1500 

GGTGGTGCTT CACACGGGCA AGTATGAGGA GCTCATCGTC TGCTCCCACG 1550 

AGATCGCAGC CAGCACGGCC CAGCTGGTGG CGGCCTCCAA GGTGAAGGCC 1600 



WO 99/60986 PCT/US99/11743



AACAAGCACA GCCCCCACCT GAGCCGCCTG CAGGAATGTT CTCGCACAGT 1650 
CAATGAGAGG GCTGCCAATG TGGTGGCCTC CACCAAGTCA GGCCAGGAGC 1700 
AGATTGAGGA CAGAGACACC ATGGATTTCT CCGGCCTGTC CCTCATCAAG 1750 
CTGAAGAAGC AGGAGATGGA GACGCAGGTG CGTGTCCTGG AGCTGGAGAA 1800 
GACGCTGGAG GCTGAACGCA TGCGGCTGGG GGAGTTGCGG AAGCAACACT 1850 
ACGTGCTGGC TGGGGCATCA GGCAGCCCTG GAGAGGAGGT GGCCATCCGG 1900 
CCCAGCACTG CCCCCCGAAG TGTAACCACC AAGAAACCAC CCCTGGCCCA 1950 
GAAGCCCAGC GTGGCCCCCA GACAGGACCA CCAGCTTGAC AAAAAGGATG 2000 
GCATCTACCC AGCTCAACTC GTGAACTACT AGGCCCCCCA GGGGTCCAGC 2050 
AGGGTGGCTG GTGACAGGCC TGGGCCTCTG CAACTGCCCT GACAGGACCG 2100 
AGAGGCCTTG CCCCTCCACC TGGTGCCCAA GCCTCCCGCC CCACCGTCTG 2150 
GATCAATGTC CTCAAGGCCC CTGGCCCTTA CTGAGCCTGC AGGGTCCTGG 2200 
GCCATGTGGG TGGTGCTTCT GGATGTGAGT CTCTTATTTA TCTGCAGAAG 2250 
GAACTTTGGG GTGCAGCCAG GACCCGGTAG GCCTGAGCCT CAACTCTTCA 2300
GAAAATAGTG TTTTTAATAT TCCTCTTCAG AAAATAGTGT TTTTAATATT 2350 
CCGAGCTAGA GCTCTTCTTC CTACGTTTGT AGTCAGCACA CTGGGAAACC 2400 
GGGCCAGCGT GGGGCTCCCT GCCTTCTGGA CTCCTGAAGG TCGTGGATGG 2450 
ATGGAAGGCA CACAGCCCGT GCCGGCTGAT GGGACGAGGG TCAGGCATCC 2500 
TGTCTGTGGC CTTCTGGGGC ACCGATTCTA CCAGGCCCTC CAGCTGCGTG 2550 
GTCTCCGCAG ACCAGGCTCT GTGTGGGCTA GAGGAATGTC GCCCATTACC 2600 
TCCTCAGGCC CTGGCCCTCG GGCCTCCGTG ATGGGAGCCC CCCAGGAGGG 2700 
GTCAGATGCT GGAAGGGGCC GCTTTCTGGG GAGTGAGGTG AGACATAGCG 2750 
GCCCAGGCGC TGCCTTCACT CCTGGAGTTT CCATTTCCAG CTGGAATCTG 2800 
CAGCCACCCC CATTTCCTGT TTTCCATTCC CCCGTTCTGG CCGCGCCCCA 2850
CTGCCCACCT GAAGGGGTGG TTTCCAGCCC TCCGGAGAGT GGGCTTGGCC 2900 
CTAGGCCCTC CAGCTCAGCC AGAAAAAGCC CAGAAACCCA GGTGCTGGAC 2950 
CAGGGCCCTC AGGGAGGGAC CCTGCGGCTA GAGTGGGCTA GGCCCTGGCT 3000 
TTGCCCGTCA GATTTGAACG AATGTGTGTC CCTTGAGCCC AAGGAGAGCG 3050 
GCAGGAGGGG TGGGACCAGG CTGGGAGGAC AGAGCCAGCA GCTGCCATGC 3100 
CCTCCTGCTC CCCCCACCCC AGCCCTAGCC CTTTAGCCTT TCACCCTGTG 3150 
CTCTGGAAAG GCTACCAAAT ACTGGCCAAG GTCAGGAGGA GCAAAAATGA 3200 
GCCAGCACCA GCGCCTTGGC TTTGTGTTAG CATTTCCTCC TGAAGTGTTC 3250 
TGTTGGCAAT AAAATGCACT TTGACTGTTA AAAAAAAAAA AAAAAAAAAA 3300
A 3301



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 676

(B) TYPE: protein

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(ix) FEATURE: Huntingtin-interacting protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Gly Glu Leu Glu Glu Gln Arg Lys Gln Lys Gln Lys Ala Leu Val

5 10 15

Asp Asn Glu Gln Leu Arg His Glu Leu Ala Gln Leu Arg Ala Ala

20 25 30

Gln Leu Glu Arg Glu Arg Ser Gln Gly Leu Arg Glu Glu Ala Glu

35 40 45 

Arg Lys Ala Ser Ala Thr Glu Ala Arg Tyr Asn Lys Leu Lys Glu 

50 55 60 

Lys His Ser Glu Leu Val His Val His Ala Glu Leu Leu Arg Lys 

65 70 75 

Asn Ala Asp Thr Ala Lys Gln Leu Thr Val Thr Gln Gln Ser Gln

80 85 90

Glu Glu Val Ala Arg Val Lys Glu Gln Leu Ala Phe Gln Val Glu

95 100 105

Gln Val Lys Arg Glu Ser Glu Leu Lys Leu Glu Glu Lys Ser Asp

110 115 120

Gln Gln Glu Lys Leu Lys Arg Glu Leu Glu Ala Lys Ala Gly Glu

125 130 135

Leu Ala Arg Ala Gln Glu Ala Leu Ser His Thr Glu Gln Ser Lys

140 145 150 

Ser Glu Leu Ser Ser Arg Leu Asp Thr Leu Ser Ala Glu Lys Asp 

155 160 165 

Ala Leu Ser Gly Ala Val Arg Gln Arg Glu Ala Asp Leu Leu Ala

170 175 180

Ala Gln Ser Leu Val Arg Glu Thr Glu Ala Ala Leu Ser Arg Glu

185 190 195

Gln Gln Arg Ser Ser Gln Glu Gln Gly Glu Leu Gln Gly Arg Leu

200 205 210

Ala Glu Arg Glu Ser Gln Glu Gln Gly Leu Arg Gln Arg Leu Leu

215 220 225

Asp Glu Gln Phe Ala Val Leu Arg Gly Ala Ala Ala Glu Ala Ala

230 235 240

Gly Ile Leu Gln Asp Ala Val Ser Lys Leu Asp Asp Pro Leu His

245 250 255

Leu Arg Cys Thr Ser Ser Pro Asp Tyr Leu Val Ser Arg Ala Gln

260 265 270

Glu Ala Leu Asp Ala Val Ser Thr Leu Glu Glu Gly His Ala Gln

275 280 285

Tyr Leu Thr Ser Leu Ala Asp Ala Ser Ala Leu Val Ala Ala Leu 

290 295 300 

Thr Arg Phe Ser His Leu Ala Ala Asp Thr Ile Ile Asn Gly Gly

305 310 315

Ala Thr Ser His Leu Ala Pro Thr Asp Pro Ala Asp Arg Leu Ile

320 325 330 

Asp Thr Cys Arg Glu Cys Gly Ala Arg Ala Leu Glu Leu Met Gly 

335 340 345 

Gln Leu Gln Asp Gln Gln Ala Leu Arg His Met Gln Ala Ser Leu

350 355 360

Val Arg Thr Pro Leu Gln Gly Ile Leu Gln Leu Gly Gln Glu Leu

365 370 375

Lys Pro Lys Ser Leu Asp Val Arg Gln Glu Glu Leu Gly Ala Val

380 385 390

Val Asp Lys Glu Met Ala Ala Thr Ser Ala Ala Ile Glu Asp Ala

395 400 405

Val Arg Arg Ile Glu Asp Met Met Asn Gln Ala Arg His Ala Ser

410 415 420

Ser Gly Val Lys Leu Glu Val Asn Glu Arg Ile Leu Asn Ser Cys

425 430 435

Thr Asp Leu Met Lys Ala Ile Arg Leu Leu Val Thr Thr Ser Thr

440 445 450

Ser Leu Gln Lys Glu Ile Val Glu Ser Gly Arg Gly Ala Ala Thr

455 460 465

Gln Gln Glu Phe Tyr Ala Lys Asn Ser Arg Trp Thr Glu Gly Leu

470 475 480

Ile Ser Ala Ser Lys Ala Val Gly Trp Gly Ala Thr Gln Leu Val

485 490 495 

Glu Ala Ala Asp Lys Val Val Leu His Thr Gly Lys Tyr Glu Glu 

500 505 510 

Leu Ile Val Cys Ser His Glu Ile Ala Ala Ser Thr Ala Gln Leu

515 520 525

Val Ala Ala Ser Lys Val Lys Ala Asn Lys His Ser Pro His Leu

530 535 540

Ser Arg Leu Gln Glu Cys Ser Arg Thr Val Asn Glu Arg Ala Ala

545 550 555

Asn Val Val Ala Ser Thr Lys Ser Gly Gln Glu Gln Ile Glu Asp

560 565 570

Arg Asp Thr Met Asp Phe Ser Gly Leu Ser Leu Ile Lys Leu Lys

575 580 585

Lys Gln Glu Met Glu Thr Gln Val Arg Val Leu Glu Leu Glu Lys

590 595 600

Thr Leu Glu Ala Glu Arg Met Arg Leu Gly Glu Leu Arg Lys Gln

605 610 615

His Tyr Val Leu Ala Gly Ala Ser Gly Ser Pro Gly Glu Glu Val

620 625 630

Ala Ile Arg Pro Ser Thr Ala Pro Arg Ser Val Thr Thr Lys Lys

635 640 645

Pro Pro Leu Ala Gln Lys Pro Ser Val Ala Pro Arg Gln Asp His

650 655 660

Gln Leu Asp Lys Lys Asp Gly Ile Tyr Pro Ala Gln Leu Val Asn

665 670 675



Tyr 



(2) INFORMATION FOR SEQ ID NO:8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2338 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: mouse 

(ix) FEATURE: cDNA for Huntingtin-interacting protein - mHIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GGCACGAGGG CTCATTCAGA TCCCCCAGCT GCCCGAGAAT CCACCCAACTT 50 
CCTACGAGCC TCGGCCCTGT CAGAGCACAT CAGTCCTGTG GTGGTGATCCC 100 
GGCAGAGGTG TCATCCCCAG ACAGTGAGCC TGTCCTGGAG AAGGATGACCT 150 
CATGGACATG GACGCCTCCC AGCAGACTTT GTTTGACAAC AAGTTTGATGA 200
CGTCTTTGGC AGCTCATTGA GCAGCGACCC TTTCAATTTC AACAATCAAAA 250
TGGCGTGAAC AAGGACGAGA AGGACCACTT GATTGAACGC CTGTACAGAGA 300
GATCAGTGGA CTGACAGGGC AGCTGGACAA CATGAAGATT GAGAGCCAGCG 350
GGCCATGCTG CAGCTGAAGG GTCGAGTGAG TGAGCTGGAG GCAGAGCTAGC 400
AGAGCAGCAG CACTTGGGCC GGCAGGCTAT GGATGACTGC GAGTTCCTGCG 450
CACTGAGCTG GATGAACTGA AGAGGCAGCG AGAGGACACG GAGAAGGCACA 500
GCGCAGCCTG ACTGAGATAG AAAGAAAGGC CCAGGCTAAT GAACAGAGGTA 550
TAGCAAGTTA AAAGAGAAGT ACAGTGAACT GGTGCAGAAC CATGCTGACCT 600
GCTGCGGAAG AACGCAGAGG TGACCAAACA GGTGTCCGTG GCCCGGCAAGC 650
CCAGGTGGAT TTGGAAAGAG AGAAAAAAGA GCTAGCAGAT TCCTTTGCAC 700
GTGTAAGTGA CCAGGCCCAG CGGAAGACTC AAGAGCAACA GGATGTTCTA 750
GAGAACCTGA AGCATGAACT GGCCACCAGC AGACAGGAGC TGCAGGTCCT 800
CCACAGCAAC CTGGAAACCT CTGCCCAGTC AGAAGCGAAA TGGCTGACAC 850
AGATCGCCGA GTTGGAGAAG GAACAAGGCA GCTTGGCGAC TGTTGCAGCT 900
CAGAGAGAGG AAGAGTTATC AGCCCTCCGA GACCAGCTGG AAAGCACCCA 950
GATCAAGCTG GCTGGGGCCC AGGAATCCAT GTGCCAGCAG GTGAAGGACC 1000
AGAGGAAAAC CCTCTTGGCA GGGATCAGGA AGGCTGCGGA GCGTGAGATA 1050
CAGGAGGCGC TGAGCCAGCT TGAGGAACCC ACCCTCATCA GCTGTGCAGG 1100
ATCCACAGAT CACCTTCTCT CCAAAGTCAG CTCCGTTTCC AGCTGCCTCG 1150
AGCAACTGGA AAAGAACGGC AGCCAGTATC TGGCCTGCCC AGAAGATATT 1200
AGTGAGCTTC TGCACTCGAT CACCCTGCTT GCCCACTTGA CCGGTGACAC 1250
TGTCATCCAG GGGAGTGCCA CCAGCCTCCG GGCCCCACCG GAGCCAGCCG 1300
ACTCGTTGAC GGAGGCCTGT AGGCAGTATG GCAGAGAAAC CCTGGCCTAT 1350
CTGTCCTCCC TGGAGGAAGA GGGAACTGTG GAGAATGCTG ACGTCACAGC 1400
CCTTAGGAAT TGCCTCAGCA GGGTCAAGAC CCTTGGCGAG GAGCTGCTGC 1450
CCAGGGGCCT GGACATCAAG CAGGAAGAGC TGGGTGACCT GGTGGACAAG 1500
GAGATGGCAG CCACTTCAGC TGCCATTGAA GCTGCCACCA CCCGGATAGA 1550
GGAAATTCTC AGTAAGTCCC GAGCAGGAGA CACGGGAGTC AAGCTGGAGG 1600
TGAATGAGAG GATCCTGGGT TCCTGTACCA GCCTGATGCA GGCCATCAAG 1650
GTGCTCGTTG TGGCCTCCAA GGACCTCCAG AAGGAGATAG TGGAGAGTGG 1700
CAGGGGTAGT GCATCCCCTA AAGAATTTTA CGCCAAGAAC TCTCGGTGGA 1750
CGGAAGGGCT GATATCCGCC TCCAAAGCTG TTGGTTGGGG AGCTACCATC 1800
ATGGTGGATG CTGCTGATCT TGTGGTCCAA GGCAAAGGGA AGTTCGAGGA 1850
GCTGATGGTG TGTTCACGCG AGATTGCTGC CAGTACTGCC CAGCTCGTGG 1900
CTGCATCCAA GGTGAAAGCG AACAAGGGCA GCCTCAATCT GACCCAGCTG 2000
CAGCAGGCCT CTCGAGGAGT GAACCAGGCC ACAGCCGCTG TGGTGGCCTC 2050
AACCATTTCT GGCAAATCTC AGATTGAGGA AACAGACAGT ATGGACTTCT 2100
CAAGCATGAC ACTGACCCAG ATCAAGCGCC AGGAGATGGA TTCCCAGGTT 2150
AGGGTGCTGG AGCTGGAAAA TGACCTGCAG AAGGAGCGTC AGAAACTAGG 2200
AGAGCTACGG AAGAAACACT ACGAGCTGGA GGGCGTGGCT GAGGGCTGGG 2250
AGGAAGGGAC AGAAGCATCA CCGTCTACTG TCCAAGAAGC AATACCGGAC 2300
AAAGAGTAGA GCCAAGCCGA CACCCCACAC ATCAGAAA 2338



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 676 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: no
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: mouse 
(ix) FEATURE: Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Ala Arg Gly Leu Ile Gln Ile Pro Gln Leu Pro Glu Asn Pro Pro

5 10 15

Asn Phe Leu Arg Ala Ser Ala Leu Ser Glu His Ile Ser Pro Val

20 25 30

Val Val Ile Pro Ala Glu Val Ser Ser Pro Asp Ser Glu Pro Val

35 40 45

Leu Glu Lys Asp Asp Leu Met Asp Met Asp Ala Ser Gln Gln Thr

50 55 60 

Leu Phe Asp Asn Lys Phe Asp Asp Val Phe Gly Ser Ser Leu Ser 

65 70 75 

Ser Asp Pro Phe Asn Phe Asn Asn Gln Asn Gly Val Asn Lys Asp

80 85 90

Glu Lys Asp His Leu Ile Glu Arg Leu Tyr Arg Glu Ile Ser Gly

95 100 105

Leu Thr Gly Gln Leu Asp Asn Met Lys Ile Glu Ser Gln Arg Ala

110 115 120

Met Leu Gln Leu Lys Gly Arg Val Ser Glu Leu Glu Ala Glu Leu

125 130 135

Ala Glu Gln Gln His Leu Gly Arg Gln Ala Met Asp Asp Cys Glu

140 145 150

Phe Leu Arg Thr Glu Leu Asp Glu Leu Lys Arg Gln Arg Glu Asp

155 160 165

Thr Glu Lys Ala Gln Arg Ser Leu Thr Glu Ile Glu Arg Lys Ala

170 175 180

Gln Ala Asn Glu Gln Arg Tyr Ser Lys Leu Lys Glu Lys Tyr Ser

185 190 195

Glu Leu Val Gln Asn His Ala Asp Leu Leu Arg Lys Asn Ala Glu

200 205 210

Val Thr Lys Gln Val Ser Val Ala Arg Gln Ala Gln Val Asp Leu

215 220 225

Glu Arg Glu Lys Lys Glu Leu Ala Asp Ser Phe Ala Arg Val Ser

230 235 240

Asp Gln Ala Gln Arg Lys Thr Gln Glu Gln Gln Asp Val Leu Glu

245 250 255

Asn Leu Lys His Glu Leu Ala Thr Ser Arg Gln Glu Leu Gln Val

260 265 270

Leu His Ser Asn Leu Glu Thr Ser Ala Gln Ser Glu Ala Lys Trp

275 280 285

Leu Thr Gln Ile Ala Glu Leu Glu Lys Glu Gln Gly Ser Leu Ala

290 295 300

Thr Val Ala Ala Gln Arg Glu Glu Glu Leu Ser Ala Leu Arg Asp

305 310 315

Gln Leu Glu Ser Thr Gln Ile Lys Leu Ala Gly Ala Gln Glu Ser

320 325 330

Met Cys Gln Gln Val Lys Asp Gln Arg Lys Thr Leu Leu Ala Gly

335 340 345

Ile Arg Lys Ala Ala Glu Arg Glu Ile Gln Glu Ala Leu Ser Gln

350 355 360

Leu Glu Glu Pro Thr Leu Ile Ser Cys Ala Gly Ser Thr Asp His

365 370 375

Leu Leu Ser Lys Val Ser Ser Val Ser Ser Cys Leu Glu Gln Leu

380 385 390

Glu Lys Asn Gly Ser Gln Tyr Leu Ala Cys Pro Glu Asp Ile Ser

395 400 405

Glu Leu Leu His Ser Ile Thr Leu Leu Ala His Leu Thr Gly Asp

410 415 420

Thr Val Ile Gln Gly Ser Ala Thr Ser Leu Arg Ala Pro Pro Glu

425 430 435

Pro Ala Asp Ser Leu Thr Glu Ala Cys Arg Gln Tyr Gly Arg Glu

440 445 450 

Thr Leu Ala Tyr Leu Ser Ser Leu Glu Glu Glu Gly Thr Val Glu 

455 460 465 



Asn Ala Asp Val Thr Ala Leu Arg Asn Cys Leu Ser Arg Val Lys 

470 475 480

Thr Leu Gly Glu Glu Leu Leu Pro Arg Gly Leu Asp Ile Lys Gln

485 490 495 

Glu Glu Leu Gly Asp Leu Val Asp Lys Glu Met Ala Ala Thr Ser 

500 505 510 

Ala Ala Ile Glu Ala Ala Thr Thr Arg Ile Glu Glu Ile Leu Ser

515 520 525 

Lys Ser Arg Ala Gly Asp Thr Gly Val Lys Leu Glu Val Asn Glu 

530 535 540 

Arg Ile Leu Gly Ser Cys Thr Ser Leu Met Gln Ala Ile Lys Val

545 550 555

Leu Val Val Ala Ser Lys Asp Leu Gln Lys Glu Ile Val Glu Ser

560 565 570

Gly Arg Gly Ser Ala Ser Pro Lys Glu Phe Tyr Ala Lys Asn Ser

575 580 585

Arg Trp Thr Glu Gly Leu Ile Ser Ala Ser Lys Ala Val Gly Trp

590 595 600

Gly Ala Thr Ile Met Val Asp Ala Ala Asp Leu Val Val Gln Gly

605 610 615

Lys Gly Lys Phe Glu Glu Leu Met Val Cys Ser Arg Glu Ile Ala

620 625 630

Ala Ser Thr Ala Gln Leu Val Ala Ala Ser Lys Val Lys Ala Asn

635 640 645

Lys Gly Ser Leu Asn Leu Thr Gln Leu Gln Gln Ala Ser Arg Gly

650 655 660

Val Asn Gln Ala Thr Ala Ala Val Val Ala Ser Thr Ile Ser Gly

665 670 675

Lys Ser Gln Ile Glu Glu Thr Asp Ser Met Asp Phe Ser Ser Met

680 685 690

Thr Leu Thr Gln Ile Lys Arg Gln Glu Met Asp Ser Gln Val Arg

695 700 705

Val Leu Glu Leu Glu Asn Asp Leu Gln Lys Glu Arg Gln Lys Leu

710 715 720

Gly Glu Leu Arg Lys Lys His Tyr Glu Leu Glu Gly Val Ala Glu

725 730 735

Gly Trp Glu Glu Gly Thr Glu Ala Ser Pro Ser Thr Val Gln Glu

740 745 750

Ala Ile Pro Asp Lys Glu

755 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3964 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: mouse 

(ix) FEATURE: cDNA for Huntingtin-interacting protein - mHIP1a
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

GGCACGAGGC GGCGCGCGGC CTCCGTGTGC CTAGGCTTGA GGCGGGCGGT 50 

GACGCCTCAT TCGCGCGGAG CCGGGCCGGG ACACGGTCGG CGGCAGCATG 100 

AACAGCATCA AGAATGTGCC GGCGCGGGTG CTGAGCCGCA GGCCGGGCCA 150 

CAGCCTAGAG GCCGAGCGCG AGCAGTTCGA CAAGACGCAG GCCATCAGTA 200

TCAGCAAAGC CATCAACAGC CAGGAGGCCC CAGTGAAGGA GAAGCATGCC 250

CGGCGTATCA TCCTGGGCAC GCATCATGAG AAGGGAGCCT TCACCTTCTG 300 

GTCCTATGCC ATCGGCCTGC CGCTGTCCAG CAGCTCCATC CTCAGCTGGA 350

AGTTCTGTCA CGTCCTTCAC AAGGTCCTCC GGGACGGACA CCCCAACGTC 400 

CTGCATGACT ATCAGCGGTA CCGGAGCAAC ATACGTGAGA TCGGTGACTT 450 

GTGGGGCCAC CTTCGTGACC AGTATGGACA CCTGGTGAAT ATCTATACCA 500 

AACTGTTGCT GACTAAGATC TCCTTCCACC TTAAGCACCC CCAGTTTCCT 550

GCAGGCCTGG AGGTAACAGA TGAGGTGTTG GAGAAGGCGG CGGGAACTGA 600 

TGTCAACAAC ATTTTTCAGC TTACCGTGGA GATGTTTGAC TACATGGACT 650 

GTGAACTGAA GCTTTCTGAG TCAGTTTTCC GGCAGCTCAA CACGGCCATC 700

GCAGTGTCCC AGATGTCTTC TGGCCAGTGT CGCCTAGCGC CGCTCATCCA 750

GGTCATTCAG GACTGCAGCC ACCTGTACCA CTACACAGTG AAGCTCATGT 800 

TTAAGCTGCA CTCCTGTCTC CCGGCAGACA CCCTGCAAGG CCACAGGGAT 850 

CGGTTCCACG AGCAGTTCGA CAGCCTCAAA AACTTCTTCC GCCGGGCTTC 900 
AGACATGCTG TACTTCAAGA GGCTCATCCA GATCCCGCGG CTGCCTGAGG 950

GACCCCCCAA TTTCCTGCGG GCTTCAGCCC TGGCTGAGCA CATCAAGCCG 1000 

GTGGTGGTGA TTCCCGAGGA GGCCCCAGAG GAAGAGGAGC CTGAGAACCT 1050 

AATTGAAATC AGCAGTGCGC CCCCTGCTGG GGAGCCAGTG GTGGTGGCTG 1100 

ACCTCTTTGA TCAGACCTTT GGACCCCCCA ATGGCTCCAT GAAGGATGAC 1150 

AGGGACCTCC AAATCGAGAA CTTGAAGAGA GAGGTGGAGA CCCTCCGTGC 1200

TGAGCTGGAG AAGATTAAGA TGGAGGCACA GCGGTACATC TCCCAGCTGA 1250

AfiGGCCAGGT GAATGGCCTG GAGGCAGAGC TGGAGGAGCA GCGCAAGCAG 1300 

AAGCAGAAGG CCCTGGTGGA CAACGAGCAG CTGCGCCACG AGCTGGCCCA 1350 

GCTCAAGGCC CTGCAGCTGG AGGGCGCCCG CAACCAGGGC CTTCGAGAGG 1400 

AAGCAGAGAG GAAGGCCAGT GCCACGGAGG CACGCTACAG CAAGCTGAAG 1450 

GAGAAACACA GCGAACTCAT TAACACGCAC GCCGAGCTGC TCAGGAAGAA 1500
CGCAGACACG GCCAAGCAGC TGACAGTGAC ACAGCAGAGC CAGGAGGAGG 1550
TGGCACGGGT AAAGGAACAG CTGGCCTTCC AGATGGAGCA AGCGAAGCGT 1600 
GAGTCTGAGA TGAAGATGGA AGAGCAGAGC GACCAGTTGG AGAAGCTCAA 1650 
GAGGGAGCTG GCGGCCAGGG CAGGAGAGCT GGCCCGTGCG CAGGAGGCCC 1700
TGAGCCGCAC AGAACAGAGT GGGTCAGAGC TGAGCTCACG GCTGGACACA 1750 
CTGAACGCGG AGAAGGAAGC CCTGAGTGGA GTCGTTCGGC AGCGTGAGGC 1800 
AGAGCTGCTG GCCGCTCAGA GCCTGGTGCG GGAGAAGGAG GAGGCGCTTA 1850 
GCCAAGAGCA GCAGCGGAGC TCCCAGGAGA AGGGCGAGCT ACGGGGGCAG 1900 
CTGGCAGAAA AGGAGTCTCA GGAGCAGGGG CTTCGGCAGA AGCTGCTGGA 1950 
TGAGCAGTTG GCGGTGTTGC GAAGTGCAGC CGCCGAGGCA GAGGCCATCC 2000 
TACAGGATGC AGTGAGCAAG CTGGACGACC CCCTGCACCT CCGCTGCACC 2050 
AGCTCCCCAG ACTACTTGGT GAGCCGGGCT CAGGCAGCCC TGGACAGCGT 2100 
GAGCGGCCTG GAGCAGGGCC ACACCCAGTA CCTGGCTTCC TCCGAAGATG 2150 
CTTCTGCCCT GGTGGCAGCG CTGACCCGCT TCTCCCATTT GGCTGCGGAC 2200 
ACCATTGTCA ATGGTGCCGC CACCTCCCAC CTGGCCCCCA CCGACCCCGC 2250 
CGACCGCCTG ATGGACACAT GCAGGGAGTG TGGAGCCCGG GCTCTGGAGC 2300
TGGTGGGACA GCTGCAAGAC CAGACAGTGC TACGGAGGGC TCAGCCCAGC 2350
CTGATGCGGG CCCCCCTGCA GGGCATTCTG CAGTTGGGCC AGGACTTGAA 2400
GCCTAAGAGC CTGGATGTAC GGCAAGAGGA GCTAGGGGCC ATGGTGGACA 2450 
AGGAGATGGC GGCCACCTCG GCAGCCATTG AGGACGCTGT GCGGAGGATC 2500 
GAGGACATGA TGAGCCAGGC CCGCCACGAG AGCTCAGGCG TGAAACTGGA 2550 
GGTGAATGAG AGGATCCTCA ACTCCTGCAC AGACCTGATG AAGGCTATCC 2600 
GGCTCCTGGT GATGACCTCC ACCAGCCTGC AGAAGGAAAT TGTGGAGAGC 2650 
GGCAGGGGGG CAGCAACGCA GCAGGAATTT TATGCCAAGA ATTCACGGTG 2700
GACTGAAGGC CTCATCTCAG CCTCTAAGGC AGTGGGCTGG GGAGCCACAC 2750
AGCTGGTGGA GTCAGCTGAC AAGGTTGTGC TTCACATGGG CAAATACGAG 2800 
GAACTCATCG TCTGCTCCCA TGAGATTGCG GCCAGCACGG CCCAGCTGGT 2850 
GGCAGCCTCG AAGGTGAAAG CCAACAAGAA CAGTCCCCAC TTGAGCCGCC 2900 
TGCAGGAATG TTCCCGCACT GTCAACGAGA GGGCTGCCAA CGTCGTGGCC 2950 
TCCACCAAAT CTGGCCAGGA GCAGATTGAG GACAGAGACA CCATGGATTT 3000 
CTCTGGCCTG TCCCTCATCA AGTTGAAGAA GCAGGAGATG GAGACACAGG 3050 
TGCGAGTCTT GGAGCTGGAG AAGACACTAG AGGCAGAGCG TGTCCGGCTC 3100 
GGGGAGCTTC GGAAACAGCA CTATGTACTG GCTGGGGGGA TGGGAACACC 3150 
TAGCGAAGAA GAACCCAGCA GACCCAGCCC AGCTCCCCGA AGTGGGGCCA 3200
CTAAGAAGCC ACCGCTGGCC CAGAAACCCA GCATAGCCCC CAGGACAGAC 3250
AACCAGCTCGA CAAAAAGGAT GGTGTCTACC CAGCTCAACT TGTGAACTAC 3300 
TAGGCCCCTAA GGTGTTCAGC AGGATGGCTG GTGGTTGTGC CTGGGCTTCA 3350
TGTGGCTGTCT GGCAGTGGTC AAGGGGCCTC TGAGAAGCCT CCAACTCCTG 3400
CCCAAGGGGCC TAGTCTGTGG GACAGTTCAT CTGGATGTGA ATCTATTTAT 3450 
CTTAAGTAGGA ACTGCCTCGA GCAGCTGGGA CCCAGCAGGC CTGAGCCACA 3500 
AATCTGCAGCG GACATCAGAG ATAGTCTGAA TGCTGCGAGG TATTTCTTTC 3550
TTCGTAAGTTT AGTCAGCACA CTGGGAAAAG GTCACATAAG CCAGGAGCCT 3600 
CCTTGTCTCTG GACTCAAAAG TCTGAGGCCT TAAGTGAACA ACAGAAAGAG 3650
GGTCCCTGCTG GCTACCAGGG ATAAGGGGAT GACCTGTGAC CCTTGAGCCA 3700
GGGAGAGCAGG TAAGCTGGGT GGTGTCATCA CCTGGGGGCC TGGTGCTAGG 3750
GCATCCATGCT GGGAGCCCCA GGAGACCAGG CTTTGTGTGG GAGCCTGGCA 3800 
TCATCGTGGCT GGGGCAGCCC CTGCTCAGGT GCTGTCTCTG CCCGTGACCT 3850 
TGAAGCCACCC TCCCCCCGTA CAGTTTTCCA TTCTCCTGGC TACTAGTGTG 3900 
GCTGTTCATTG CCTACCTTGA TGAGTAGATT TCAGCCCTCC TAAAGCTGGG 3950 
GCCTTTCCTCG TGCC 3964

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 676 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: mouse 

(ix) FEATURE: Huntingtin-interacting protein - mHIP1a
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Asn Ser Ile Lys Asn Val Pro Ala Arg Val Leu Ser Arg Arg

5 10 15

Pro Gly His Ser Leu Glu Ala Glu Arg Glu Gln Phe Asp Lys Thr

20 25 30

Gln Ala Ile Ser Ile Ser Lys Ala Ile Asn Ser Gln Glu Ala Pro

35 40 45

Val Lys Glu Lys His Ala Arg Arg Ile Ile Leu Gly Thr His His

50 55 60

Glu Lys Gly Ala Phe Thr Phe Trp Ser Tyr Ala Ile Gly Leu Pro

65 70 75

Leu Ser Ser Ser Ser Ile Leu Ser Trp Lys Phe Cys His Val Leu

80 85 90 

His Lys Val Leu Arg Asp Gly His Pro Asn Val Leu His Asp Tyr 

95 100 105 

Gln Arg Tyr Arg Ser Asn Ile Arg Glu Ile Gly Asp Leu Trp Gly

110 115 120

His Leu Arg Asp Gln Tyr Gly His Leu Val Asn Ile Tyr Thr Lys

125 130 135

Leu Leu Leu Thr Lys Ile Ser Phe His Leu Lys His Pro Gln Phe

140 145 150 

Pro Ala Gly Leu Glu Val Thr Asp Glu Val Leu Glu Lys Ala Ala 

155 160 165 

Gly Thr Asp Val Asn Asn Ile Phe Gln Leu Thr Val Glu Met Phe

170 175 180

Asp Tyr Met Asp Cys Glu Leu Lys Leu Ser Glu Ser Val Phe Arg

185 190 195

Gln Leu Asn Thr Ala Ile Ala Val Ser Gln Met Ser Ser Gly Gln

200 205 210

Cys Arg Leu Ala Pro Leu Ile Gln Val Ile Gln Asp Cys Ser His

215 220 225 

Leu Tyr His Tyr Thr Val Lys Leu Met Phe Lys Leu His Ser Cys 

230 235 240 

Leu Pro Ala Asp Thr Leu Gln Gly His Arg Asp Arg Phe His Glu

245 250 255

Gln Phe His Ser Leu Lys Asn Phe Phe Arg Arg Ala Ser Asp Met

260 265 270

Leu Tyr Phe Lys Arg Leu Ile Gln Ile Pro Arg Leu Pro Glu Gly

275 280 285

Pro Pro Asn Phe Leu Arg Ala Ser Ala Leu Ala Glu His Ile Lys

290 295 300

Pro Val Val Val Ile Pro Glu Glu Ala Pro Glu Glu Glu Glu Pro

305 310 315

Glu Asn Leu Ile Glu Ile Ser Ser Ala Pro Pro Ala Gly Glu Pro

320 325 330

Val Val Val Ala Asp Leu Phe Asp Gln Thr Phe Gly Pro Pro Asn

335 340 345

Gly Ser Met Lys Asp Asp Arg Asp Leu Gln Ile Glu Asn Leu Lys

350 355 360

Arg Glu Val Glu Thr Leu Arg Ala Glu Leu Glu Lys Ile Lys Met

365 370 375

Glu Ala Gln Arg Tyr Ile Ser Gln Leu Lys Gly Gln Val Asn Gly

380 385 390

Leu Glu Ala Glu Leu Glu Glu Gln Arg Lys Gln Lys Gln Lys Ala

395 400 405

Leu Val Asp Asn Glu Gln Leu Arg His Glu Leu Ala Gln Leu Lys

410 415 420

Ala Leu Gln Leu Glu Gly Ala Arg Asn Gln Gly Leu Arg Glu Glu

425 430 435

Ala Glu Arg Lys Ala Ser Ala Thr Glu Ala Arg Tyr Ser Lys Leu

440 445 450

Lys Glu Lys His Ser Glu Leu Ile Asn Thr His Ala Glu Leu Leu

455 460 465

Arg Lys Asn Ala Asp Thr Ala Lys Gln Leu Thr Val Thr Gln Gln

470 475 480

Ser Gln Glu Glu Val Ala Arg Val Lys Glu Gln Leu Ala Phe Gln

485 490 495

Met Glu Gln Ala Lys Arg Glu Ser Glu Met Lys Met Glu Glu Gln

500 505 510

Ser Asp Gln Leu Glu Lys Leu Lys Arg Glu Leu Ala Ala Arg Ala

515 520 525

Gly Glu Leu Ala Arg Ala Gln Glu Ala Leu Ser Arg Thr Glu Gln

530 535 540 

Ser Gly Ser Glu Leu Ser Ser Arg Leu Asp Thr Leu Asn Ala Glu 

545 550 555 

Lys Glu Ala Leu Ser Gly Val Val Arg Gln Arg Glu Ala Glu Leu

560 565 570

Leu Ala Ala Gln Ser Leu Val Arg Glu Lys Glu Glu Ala Leu Ser

575 580 585

Gln Glu Gln Gln Arg Ser Ser Gln Glu Lys Gly Glu Leu Arg Gly

590 595 600

Gln Leu Ala Glu Lys Glu Ser Gln Glu Gln Gly Leu Arg Gln Lys

605 610 615

Leu Leu Asp Glu Gln Leu Ala Val Leu Arg Ser Ala Ala Ala Glu

620 625 630

Ala Glu Ala Ile Leu Gln Asp Ala Val Ser Lys Leu Asp Asp Pro

635 640 645 

Leu His Leu Arg Cys Thr Ser Ser Pro Asp Tyr Leu Val Ser Arg 

650 655 660 

Ala Gln Ala Ala Leu Asp Ser Val Ser Gly Leu Glu Gln Gly His

665 670 675

Thr Gln Tyr Leu Ala Ser Ser Glu Asp Ala Ser Ala Leu Val Ala

680 685 690

Ala Leu Thr Arg Phe Ser His Leu Ala Ala Asp Thr Ile Val Asn

695 700 705



Gly Ala Ala Thr Ser His Leu Ala Pro Thr Asp Pro Ala Asp Arg 

710 715 720 

Leu Met Asp Thr Cys Arg Glu Cys Gly Ala Arg Ala Leu Glu Leu 

725 730 735 

Val Gly Gln Leu Gln Asp Gln Thr Val Leu Arg Arg Ala Gln Pro

740 745 750

Ser Leu Met Arg Ala Pro Leu Gln Gly Ile Leu Gln Leu Gly Gln

755 760 765

Asp Leu Lys Pro Lys Ser Leu Asp Val Arg Gln Glu Glu Leu Gly

770 775 780

Ala Met Val Asp Lys Glu Met Ala Ala Thr Ser Ala Ala Ile Glu

785 790 795

Asp Ala Val Arg Arg Ile Glu Asp Met Met Ser Gln Ala Arg His

800 805 810

Glu Ser Ser Gly Val Lys Leu Glu Val Asn Glu Arg Ile Leu Asn

815 820 825

Ser Cys Thr Asp Leu Met Lys Ala Ile Arg Leu Leu Val Met Thr

830 835 840

Ser Thr Ser Leu Gln Lys Glu Ile Val Glu Ser Gly Arg Gly Ala

845 850 855

Ala Thr Gln Gln Glu Phe Tyr Ala Lys Asn Ser Arg Trp Thr Glu

860 865 870

Gly Leu Ile Ser Ala Ser Lys Ala Val Gly Trp Gly Ala Thr Gln

875 880 885

Leu Val Glu Ser Ala Asp Lys Val Val Leu His Met Gly Lys Tyr 

890 895 900 

Glu Glu Leu Ile Val Cys Ser His Glu Ile Ala Ala Ser Thr Ala

905 910 915

Gln Leu Val Ala Ala Ser Lys Val Lys Ala Asn Lys Asn Ser Pro

920 925 930

His Leu Ser Arg Leu Gln Glu Cys Ser Arg Thr Val Asn Glu Arg

935 940 945

Ala Ala Asn Val Val Ala Ser Thr Lys Ser Gly Gln Glu Gln Ile

950 955 960

Glu Asp Arg Asp Thr Met Asp Phe Ser Gly Leu Ser Leu Ile Lys

965 970 975

Leu Lys Lys Gln Glu Met Glu Thr Gln Val Arg Val Leu Glu Leu

980 985 990

Glu Lys Thr Leu Glu Ala Glu Arg Val Arg Leu Gly Glu Leu Arg

995 1000 1005

Lys Gln His Tyr Val Leu Ala Gly Gly Met Gly Thr Pro Ser Glu

1010 1015 1020

Glu Glu Pro Ser Arg Pro Ser Pro Ala Pro Arg Ser Gly Ala Thr

1025 1030 1035

Lys Lys Pro Pro Leu Ala Gln Lys Pro Ser Ile Ala Pro Arg Thr

1040 1045 1050

Asp Asn Gln Leu Asp Lys Lys Asp Gly Val Tyr Pro Ala Gln Leu

1055 1060 1065



Val Asn Tyr 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
GAAGATACCC CACCAAAC 18 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 



(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(xi)SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
GCTTGACAGT GTAGTCATAA AGGTGGCTGC AGTCC 35 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(xi)SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GGACATGTCC AGGGAGTTGA ATAC 24 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 41 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(xi)SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

CUACUACUAC UACUAGGCCA CGCGTCGACT AGTACGGGII GGGIIGGGII G 41

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 516 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human

(x) FEATURE: exon 1 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

TCTGTGGAAG GTTTGGAGGG GAGAGAGGGG CAGCTGGATG CTCTTGGGCC ACGGTCGCCC 60 

CTGATCTCTG CGCCTCTTCC TCCTGCTCCG GGAGAAATAA TGTTTCCCTG GGGGATGAAA 120 

GCATCTCTTT GTGCGGGCTT TAATTGCCAT GTTGTTGTGC CAAGGGAGTG AGTGGCGGCG 180 

GGACCAGCAG CTGGGCACAG CCAATGCCAG GCAGTGGTGC CCACTCCCTC AGGACGCCCA 240 

GCCAGCTGGC TCCTGGGAGC GCTGCCCACC TCTGCCCCCA GCTGGGCGCC TGCAAGGAAC 300 

CGACCACCCG TGGGGCTGGG GGAGGTTGGC TGGAGGAGGA GAAAGGGGCG GGCTCTGGGA 360 

GGGTCTCAGC CACTCTCAGA GGCTTATTCA TCTCATCCTC CTTTCCCTCC CCCTTCTTGT 420 

TTTTCAGACT GTCAGCATCA ATAAGGCCAT TAATACGCAG GAAGTGGCTG TAAAGGAAAA 480 

ACACGCCAGA AATATCCTTT GGATGTTGCT TGGAAG 516



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 193 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 2 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TGTTTTCCAT AACCCCCCCT CACCGTGCAT ACTGGGCACC CACCATGAGA AAGGGGCACA 60
GACCTTCTGG TCTGTTGTCA ACCGCCTGCC TCTGTCTAGC AACCCAGTGC TCTGCTGGAA 120
GTTCTGCCAT GTGTTCCACA AACTCCTCCG AGATGGACAC CCGAACGTGA GTTCCTGGGG 180
CTATGGGGTG GCA 193

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 104

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(x) FEATURE: exon 3 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

GTGTTCTTTT GCCCCTGCAG GTCCTGAAGG ACTCTCTGAG ATACAGAAAT GAATTGAGTG 60
ACATGAGCAG GATGTGGGTG AGTTTGGAGA TGTACTCAGG AGCC 104

(2) INFORMATION FOR SEQ ID NO:20:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 327

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(x) FEATURE: exon 4 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

AATTCCTGGC TGCAGATCTC TTGACTGTTA TGTTCTTGTT GTTGACTCTG TTTCCCCTCC 60 
TCTTCCTAAA AGGGCCACCT GAGCGAGGGG TATGGCCAGC TGTGCAGCAT CTACCTGAAA 120 
CTGCTAAGAA CCAAGATGGA GTACCACACC AAAGTGAGTC TCTGCGGACA GTTCTGCCGC 180 
CACCGCCGCC TCCCCTGCTC CATCCCTTCA GCCCCTCCCT GGGCTCATTT GTCAGCTCTT 240 
TCAGGTAATA GACAGCCCAG GCTTCTGAGG AAGTGTGCAC ATCATGTACC CAAGCTGTGA 300 
GAGAGGAAAG CCACCGCCAG GCCCACG 327 



(2) INFORMATION FOR SEQ ID NO:21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 331 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 5 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

GGGCTCAAGC AATCCTCCCA CCTCGGCCTC CCAAGTAGCT GGGACCACAG GCGTGTGCCA 60 
CCACGCCCGG CTGAGAGAGG GCTCTTCATG TCTTCTGCCC TGACTCCCTT CCTCTGCCTC 120 
CCTTCCAGAA TCCCAGGTTC CCAGGCAACC TGCAGATGAG TGACCGCCAG CTGGACGAGG 180 
CTGGAGAAAG TGACGTGAAC AACTTGTAAG TGGCTCCTGC CCTGAGCCCA GGGAGGGAGA 240 
AAGCTTTTGT GAATGCTGAC ACTTCTCATA AGGGTCATGG AGGGCCTGAT GGGGGGAGGC 300 
CGTGGCTGGG ATGGGGACCA AAGCCCCTGG G 331 



(2) INFORMATION FOR SEQ ID NO:22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 470 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(x) FEATURE: exon 6 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

ACTGTCGCTG TCACTGTTGA CTTCACCAGG CTGCATGGCC ATAATACCCA CAAGGCTAAG 60
ACTTGGAGCT GGAGTTGTGT GTGTGTTTGC GCATGCACAT GAGCATTGGA GACTGGAGTA 120
GCGTAGAGCG TGGGGGAGGG GACAGGTAAC AGACCGGCCT CAGGCTGTGG AGTGTAAGCT 180
CTCTTTCCTC TTGGGTCCAG TTTCCAGTTA ACAGTGGAGA TGTTTGACTA CCTGGAGTGT 240
GAACTCAACC TCTTCCAAAC AGGTGAGTCT CTTCCCTCCC GTCTAACCCA GGCTCTCATG 300
GGAACTACCT AATTCCTAGT CCTCCTCTCC CTGCAAAGTG TGCAGCACAA GGGGTAGGAA 360
AATGGAGACA TTCACACCCC ATCTCTGGTC TCTCCAACCC TCGTGCAGGG AGGGACTGAA 420
CCTCTTCAGT ATTTTTCTTT TTAAGAGACA AGGTCTCGGC CGGGTGCAGT 470

(2) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 565

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(x) FEATURE: exon 7 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

TCTTCACCTG TTTAATGGGG ATACGTTTAC CTATCTCATG GGAGTGTTGT GAAGGTTAAA 60

TGAATTAGAT GAGGTAAAGC ACGCACAGAA TCGGTCCTTG GTGTATGTTG GACCCCTGCC 120

TCTGCCCCTC TGAAGAGGCT GCCTGTAATC CCCTGGCTCT ACCACCTTTC TCCCTCACTT 180

TTATTTCCTA GTATTCAACT CCCTGGACAT GTCCCGCTCT GTGTCCGTGA CGGCAGCAGG 240

GCAGTGCCGC CTCGCCCCGC TGATCCAGGT CATCTTGGAC TGCAGCCACC TTTATGACTA 300

CACTGTCAAG CTTCTCTTCA AACTCCACTC CTGTGAGTAC CGCGGGCCAG ATCTTCTTAC 360

ATGAGATTCA GGCCAGAGGG AGGATCCCAG CCTGAGGATG TCCCCAGAGA AACGCAGTCC 420

TTCTCAGTGC CTTTGGCTGT CTGCTTCTGT TCCAAAAGGC CCCGGAGCTT CTGACCATTG 480

TGAGGATAAA AGAGCAGGGC CCAGGCTTTG GTGACCCCAG TAAAGCCCCT GGCTTGCCAC 540
TCTTGCGTCC AGTGTTACAG GATCT 565



(2) INFORMATION FOR SEQ ID NO:24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 233 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 8 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

GGGACAGCTC TAGGCCAGTC GTGGCCCCTG GCAGTGCTGG CCACATGCCC CAGGGTAGCT 60 

GGGCCCCTCC CCCTCGAGAG CCCCGCTGTG GCTTCCCTGC CCTCTGGTCC CCCTCCCCTC 120 

TCACACTCTT TCCAATTTCT TCCAGGCCTC CCAGCTGACA CCCTGCAAGG CCACCGGGAC 180 

CGCTTCATGG AGCAGTTTAC AAAGTAAGTG GTTCAAGTAA CAGGAATGGA GGT 233 

(2) INFORMATION FOR SEQ ID NO:25: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 578 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exons 9 and 10 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

TGAATCCCAG CACCATGGAG TTTATCTCCT TGACAGCCTG TGCCTTTGGG CTGGGGAGGG 60 

GGCAGGAAAG CCAGGTGGCT GCTCTGTCCC CTACATGGGG CTGATGAAGA CACCCAGCAC 120 

CCCTCAGGTC CTTCTCCACC CCTAGGTTGA AAGATCTGTT CTACCGCTCC AGCAACCTGC 180 

AGTACTTCAA GCGGCTCATT CAGATCCCCC AGCTGCCTGA GGTAAGCATG CCCAACCACA 240 

CACCCTCGGC ACTGCAGAGG CCCCAGGTAC TCTCTTAAGG GCCGGCGGGG CCTGGCAAGC 300 

AAGCACTATT TGAGGATGTG TCTCCGTCTT CAGAACCCAC CCAACTTCCT GCGAGCCTCA 360 

GCCCTGTCAG AACATATCAG CCCTGTGGTG GTGATCCCTG CAGAGGCCTC ATCCCCCGAC 420 

AGCGAGCCAG TCCTAGAGAA GGATGACCTC ATGGACATGG ATGCCTCTCA GCAGGTGAGG 480 

ACCACTTGGG AGAGAAACTT GGCCTTTCCT CTCACCTGCA AGTACAGGGG AGAGGCTGGG 540 

GGAGACCCTG GCCAAAGCCC ATTGACTCTA ACCAGGTT 578 

(2) INFORMATION FOR SEQ ID NO:26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 390 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 11 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

AAAAAAATTT AAAAAATTAA ACAGGTCTGA ACCGTTTAAT TCGAGAAAGG GGGCATTCTC 60 

CCATATCACT CAACTGACCC ACACACAGAA TTCTCTGGCT CTCTGACTTA TTCTCACTCC 120 

TTTTTGGTCA ACCACAGAAT TTATTTGACA ACAAGTTTGA TGACATCTTT GGCAGTTCAT 180 

TCAGCAGTGA TCCCTTCAAT TTCAACAGTC AAAATGGTGT GAACAAGGAT GAGAAGTGAG 240 

TCCAAGCTGG GTTCAAGCAG ATGGTTCAGG AGCTAAGTTA AGCCATGGTC TGCCTCAAAA 300 

CACTAACCAA AGAGGAATTC TTAATGATAC TGGGGCTTCT TAGATACAGA ACATCTTGAA 360 

GGGTTGGGGG CAATGGCTTA TGCCTGTAAT 390

(2) INFORMATION FOR SEQ ID NO:27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 547 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 12 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

AAAATCAATA ACCATGGATT TATGAGTATT AGATTAGTAT CTGGTAACAT TTAGAGTATA 60 
ATTTATGGCA TTTCAAAGAA TTGTCCCCAA ATTAATACCA GCTTTTAATT TCCTCCCCTG 120 
AGCTCACAAT TAAAAACAGA GGGATAGAAG CACTATGAAA GCAAACTCAT TCCCCTTCTC 180 
TTCCCAGGGA CCACTTAATT GAGCGACTAT ACAGAGAGAT CAGTGGATTG AAGGCACAGC 240 
TAGAAAACAT GAAGACTGAG GTATAACTTG GATCTGCTCT GCCTTTGCGC TTCACCAAAft 300 
CACGGTAGAT TTGAATGTTA AATTTGCATC ACACTAGCCA GGCACAGTGG CTCACACCTG 360 
TAATCCTAGC ACTTTGGGAG GCCAAGGCAG GAGGATTACC TGAGGTCGGG AGTTCGAGAC 420 
CAGCCTGGGC AACAGGGTGA AACCCCCGTC TTCAATAAAA ATGCAATAAT TAGCCGGGTG 480 
TGTTGGCAGG CACCTGTAAT CCCAGCTACT CGGGAAGCTG AGGCATGAGA ATTGCTTGAA 540
CTTGGGA 547



(2) INFORMATION FOR SEQ ID NO:28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 436 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 13 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

CCCCCAGCCA CTCTAAAGAG GACCACAATT CCCCGGCCAT CATCCCCTGT TATTGTTGTT 60 
GATTGAGGGG CTCCTAATGA CCAGATGGTC CAACCCTCCT GGGACGTGGA GAGTTGACTT 120 
AGGGGAATCA GGTATTTACT TGGAAGCATG GTAGGACCCG CTTCTCCGGC CCATGCCCGT 180 
GACCCGTGGC AGTGGGCGGT TGGCCTCATG ACCGGAGTCC CCCCACAGAG CCAGCGGGTT 240 
GTGCTGCAGC TGAAGGGCCA CGTCAGCGAG CTGGAAGCAG ATCTGGCCGA GCAGCAGCAC 300 
CTGCGGCAGC AGGCGGCCGA CGACTGTGAA TTCCTGCGGG CAGAACTGGA CGAGCTCAGG 360 
AGGCAGCGGG AGGACACCGA GAAGGCTCAG CGGAGCCTGT CTGAGATAGA AAGTGAGCGG 420 
TGGGTGGGGG CGGGGG 436



(2) INFORMATION FOR SEQ ID NO:29: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 469

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 14 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

GACTTGAGCC CAAGGAGGTC AAGGCTGCAG TGAACAGTGA TTGTGCCACT GCACCCCAGC 60

CTGGGTGACA GAGCAAGACT GTCTCAAAAC AAAACAAGGA GGACCTTCTA GGGACCCTGG 120 

CTCATTGCAA GGAAGGCAAG GGTCCCTGCT AGGTTAGACT CCTCACCTTG GTCCTTTACA 180 

ATACAGGGAA AGCTCAAGCC AATGAACAGC GATATAGCAA GCTAAAGGAG AAGTACAGCG 240 

AGCTGGTTCA GAACCACGCT GACCTGCTGC GGAAGGTAAG ACCCTCAGCC CCTGTCACCA 300 

TCCTGCAGGC CCTGCACCTC TAGGGAGAGA GCGGCTCAGG CCTGTGGCTT CCCCGGGGCC 360 

AGCAACCCCT ACATTGATCT CTAAGGCATT GCCGTCATCT CGGGAACCAC ACCTTTTCAG 420 

GCTTCCTTGC CTCTGTGTCT TGGGCTGTGT CCTGGGTGCC AATCCCATG 469 



(2) INFORMATION FOR SEQ ID NO:30: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 359 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 15 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

GGGTAGGAAA GTGATTCCTG TGTCTGACTC TAGGGCACGC ACAGCCTGAG TATGATTGTC 60 
CTAGAAGGAG GATGTCCTCT AAGCCTGGGA TCTCCTGGTT CAAGACACTG TTCTTCTTTT 120 
GCAGAATGCA GAGGTGACCA AACAGGTGTC CATGGCCAGA CAAGCCCAGG TAGATTTGGA 180 
ACGAGAGAAA AAAGAGCTGG AGGATTCGTT GGAGCGCATC AGTGACCAGG GCCAGCGGAA 240 
GGTGAGTGGG ACGAGGAGCA CTCGGGAAAT GAGGGAGGGG GCTGTTGAGT TGGTGGCGGG 300 
GGCTTTGTGG CCTTCTGCTC CATGGGCAGT TCTGTGGGTC GGTTGGCATC ACACAGCAG 359 

(2) INFORMATION FOR SEQ ID NO:31:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 209 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 16 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

GTTGATCGCT TGGGACGTTT TTACATTTTT ATATTCTTTG TCACTGTCAC CCAGATCAGA 60
GTCCCTCTGT TTTTCTTCTC TTTCAGACTC AAGAACAGCT GGAAGTTCTA GAGAGCTTGA 120
AGCAGGAACT TGCCACAAGC CAACGGGAGC TTCAGGTTCT GCAAGGCAGC CTGGAAACTT 180
CTGCCCAGGT AAATACCTCC TTTTTTTTT 209



(2) INFORMATION FOR SEQ ID NO:32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 485 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 17 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

CCCCCACTGC AATCAGTGTG TCCCCGGGAG GGAATCAGAG TGGCAGGTTA AAGAGCCATC 60
ACCTTCCCAG TCCTTGCAAC CCGGTGGTGG GTTGGACCTC TGGGAAGTAG GGACTGTTTA 120
ACTCAACCAG CGTCTCCCTC TTTCCTTGTG GTCACCTTTG CAGTCAGAAG CAAACTGGGC 180
AGCCGAGTTC GCCGAGCTAG AGAAGGAGCG GGACAGCCTG GTGAGTGGCG CAGCTCATAG 240
GGAGGAGGAA TTATCTGCTC TTCGGAAAGA ACTGCAGGAC ACTCAGCTCA AACTGGCCAG 300
CACAGAGGGT CACGGACATG GACACGAGCG AGCACCTGTG AATTCCCACC GAGGGCCTCT 360
GCGCATGCAC GGAGGCTGGG AGGACCCCGG GGCTGCTGAG AAGGGGTTTG GGGCCTTGGC 420
CTGATTGTGC AGACATTCTG TAGGTGTAAT GCCAGCAGGC CCTGCATTGC CTGCAGAGTC 480
CATGA 485



(2) INFORMATION FOR SEQ ID NO:33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 468 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 18 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

TTACTGGCTT GGACCTCATT GGCCATGACT TGAGCTAAGA TGCTAAGAGC CCCAGCCAGG 60

TCATCCTGCT CAGGTTCATT ATGGAGTCTA GGGCAGACTC TCACCTCCCT GGACCATTTT 120

TAGAATCTAT GTGCCAGCTT GCCAAAGACC AACGAAAAAT GCTTCTGGTG GGGTCCAGGA 180

AGGCTGCGGA GCAGGTGATA CAAGACGCCC TGAACCAGCT TGAAGAACCT CCTCTCATCA 240 

GCTGCGCTGG GTCTGCAGGT ACACTTGCAA TTGCCCAGCT GGCAGGGGCC AGGTCCTTAC 300 

AGCCTGAGAC TCTGTTGATG TTGAATCTCA TGTGAGACTT AGCTCAGGGG CTCTCAGCCC 360 

AGCAGCATGT CAGCATTACC TTAGGGGCGC CCAGGCCCCA TCCTAGATCA GTTACATGTG 420 

GAAACTCTGT GCATTAGTGC CTATACACTA GTATTTTAGT ATTTTCTT 468 



(2) INFORMATION FOR SEQ ID NO:34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 393 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 19 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 



CACTAGTAAG CTCCTCCATT CAGTGCTTAA TTAACGAGGA TGAAGCCAGC TATGAGAACT 60 

TGCTCTGACC TTGCCCTGTG TTCCCTCTCA CAGATCACCT CCTCTCCACG GTCACATCCA 120 

TTTCCAGCTG CATCGAGCAA CTGGAGAAAA GCTGGAGCCA GTATCTGGCC TGCCCAGAAG 180 

GTAAGAATGG CCAAGGACAG TCTCTGTCGG CTAGTGATGG CCAGACAGGG TTCAGAAGCA 240 

CCTGAATGCG GGGATAGTGA CAGGTCCCTC TGCATCAAGA AAGGCATGTA GGCAACTCAT 300 

ACAAGAAAGG CATGTAGGCA ACTCATAAAA CGGGAGGAGA GGGTATGAAA GTGTCACCAT 360 

CAACCAGACC TGAGAAACTT CTCTTTCCAA TCC 393 



(2) INFORMATION FOR SEQ ID NO:35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 421 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 20 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 



GGCCTGCCCA GAAGGTAAGA ATGGCCAAGG ACAGTCTCTG TCGGCTAGTG ATGGCCAGAC 60 

AGGGTTCAGA AGCACCTGAA TGCGGGGATA GTGACAGGTC CCTCTGCATC AAGAAAGGCA 120 

TGTAGGCAAC TCATACAAGA AAGGCATGTA GGCAACTCAT AAAACGGGAG GAGAGGGTAT 180 

GAAAGTGTCA CCATCAACCA GACCTGAGAA ACTTCTCTTT CCAATCCTGG CAGACATCAG 240 

TGGACTTCTC CATTCCATAA CCCTGCTGGC CCACTTGACC AGCGACGCCA TTGCTCATGG 300 

TGCCACCACC TGCCTCAGAG CCCCACCTGA GCCTGCCGAC TGTGAGTACT GGGGCATGAG 360 

GGGCTGTTCA TGGACCAGGG GAGCAGGGGG CCTTTAAAAG TCTCTGTTGG GCCGGGCGCA 420 

G 421 



WO 99/60986 PCT/US99/11743

(2) INFORMATION FOR SEQ ID NO:36: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 498 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 21 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

AGGCCGAGGC AGGAGAATCG CTTGAACTCA GGAGGCGGAG TTTGCAGTGA GCCGAGATGG 60 

CGCCACTGCA CTCCAGCCTG GGCAACAAGA GCGAGACTCC ATCTCAAAAA AAAAGTGTCT 120 

ATTGCCTTGT ATCTCCAGCA CTGACCGAGG CCTGTAAGCA GTATGGCAGG GAAACCCTCG 180 

CCTACCTGGC CTCCCTGGAG GAAGAGGGAA GCCTTGAGAA TGCCGACAGC ACAGCCATGA 240 

GGAACTGCCT GAGCAAGATC AAGGCCATCG GCGAGGTACT TGGAGTAGTA TCATTGAGGA 300 

GCATTGTTAT TCTTCTGGGT GTGCGTGCTG GTGAATGGCC AGGGAATCGG TGATGTTCTG 360

AGCTAGTTCT TTCTGCACTT AGAACTTGAT TCTAGAAAGA GATTGTTAAA ATTGGAAAAT 420 

CTGGCCGGGT GCAGTGATTT ATGCGTGTAA TCCCAGCACT TTGGGAGGCC GAGTCAGGAG 480 
GATCACTTGA GGCTAGAC 498



(2) INFORMATION FOR SEQ ID NO:37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 427 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 22 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

CCCTGTGGCT TGCAGAAGGT GTTTGCTGGG TGGCCTCCTG CCTTGCCATC TTGTAAGGGT 60 
TACAGATGGC AGAGGAGAAG AGACAGGAGG CCCCAAGGTC AGTTCAGCCT TTGTGATGTG 120 
TTCACAGGAG CTCCTGCCCA GGGGACTGGA CATCAAGCAG GAGGAGCTGG GGGACCTGGT 180 
GGACAAGGAG ATGGCGGCCA CTTCAGCTGC TATTGAAACT GCCACGGCCA GAATAGAGGT 240 
AGGAGGTTCC TGCAGGATCT CCTGAAACGA TGCCTTTGCA GCTGCCCTTC TGCAACACTG 300 
CTCATTAAAC ATGTCACAGT CGTTCATTAA GGCCATGGCA ACCCCCTAAG ACAGAAACCA 360 
GAATTTGCCA GGCACAGTGG CTCATGCCTG TAACCCCAGC ACCTTGGGAG GATCACTTGA 420 
GTCCAGG 427



(2) INFORMATION FOR SEQ ID NO:38: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 367

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 23 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 



CCCCCTGAAT AGGTTAGAGT CTGGATTCTT TTCTGACTCT CTCAAGAATG TGGGCAGGGA 60 

CTTGGGGACT TCCAGATTCA GGTTTCCCAG CTACCACACG ATGTTGGACT GAAAGTATAG 120 

TAAGACATTA GTGGATCCTT AATATTCAAG GCACATTTAG AAACCATGCT TCTTTTTCAC 180 

AGGAGATGCT CAGCAAATCC CGAGCAGGAG ACACAGGAGT CAAATTGGAG GTGAATGAAA 240 

GGTCGGTCTG AGCGGCATGG TGGGACCTAG GGGAGCAGGA TCTGTCTTCC TGACATTGGT 300 

CTATACTTTG CATACTTATT AGGGAATTAG AGGAGAGCAG TAGCAGCCAC GGGGAAGGGC 360 

TGAGTTG 367 



(2) INFORMATION FOR SEQ ID NO:39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 502 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 24 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 



CCCCGCAGAA TGTTCCAGCA ACCTCAGCAC CCTTCTTACC TCCCTTTCCC ATTCCAAGCT 60 

TGCCTTTGGC TAGGAGTGGG GAAGAGAACC GTCGTGTTCA TTGATCTTGG ATCTTGATCT 120 

CAGTGTATCC TCGACTTGTT TGTTTGGCAG GATCCTTGGT TGCTGTACCA GCCTCATGCA 180 

AGCTATTCAG GTGCTCATCG TGGCCTCTAA GGACCTCCAG AGAGAGATTG TGGAGAGCGG 240 

CAGGGTGAGC GTGGGTGTGG GCCCTGGGCA GGAAGAGGAG GCATCGGTGA CAGACTCCCG 300 

CTCCAACGGA CTCTGTGATG CTGCCGTCTT ACTCTGTGTG TCCACCTGAG TACAGAGCAG 360 

CCACTCCTGT AGATATCAGC AGAGGCCCTG GGGAGAAGTC AGAGCTCCAG GACCTCCCCA 420 

GAGGGTGGCC AGGCATGTGT CCCAACTCCA GCTCCCTTCG CACAGGCAGA CATTGTTGGA 480 

ACTTGCTGTG GGAGCCCTTT TT 502 



(2) INFORMATION FOR SEQ ID NO:40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 437 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(x) FEATURE: exon 25 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

TTTTGGTCTC TGAATCTTCT TCTTTTTTGT AAAATGGGAA TACTAATGCT TATGTCTCAG 60 

AGTTACTATG AGGATGATTT GGGATAATAT ATGTATAAAA GCACCTGCCA TATAGTACAT 120 

GCTCAATAAA AGGTGGCTAT TACTATTTTT TATTTCCCTA GGGTACAGCA TCCCCTAAAG 180 

AGTTTTATGC CAAGAACTCT CGATGGACAG AAGGACTTAT CTCAGCCTCC AAGGCTGTGG 240 

GCTGGGGAGC CACTGTCATG GTGTAAGTAT CTATTGGTAC CAAGGGTCCT CCCATGACCC 300 

CTCTTCCATT GATCCACTCC AAACAATAGC TAAGGAGGGA AAAAAAAATC TGTCCCTTAG 360 

AAATAAACTA TTGATCAGGA AGTCAATAGG ACCGAGTTTA CAAGGGAGCC TGGCTCTCCC 420 

AGGGGACACA GGGCAGG 437 



(2) INFORMATION FOR SEQ ID NO:41: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 351 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 26 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

GGGAGCCTGG CTCTCCCAGG GGACACAGGG CAGGCAGCCT CCCCTCCCTG TTTAGCCAAG 60 
GGCGATGGGG TGGTCTGGAG GTGGGATTGT GGAGGAGTTG CAGCTCATTT GCCCGTAACC 120 
TAGTCCCTCT TGTCGTTTTC CATCAGGGAT GCAGCTGATC TGGTGGTACA AGGCAGAGGG 180 
AAATTTGAGG AGCTAATGGT GTGTTCTCAT GAAATTGCTG CTAGCACAGC CCAGCTTGTG 240 
GCTGCATCCA AGGTAGGACC TGGCTGGACC TCCTAGGACG CTGGAAGGCC TGGTTAGAGA 300

GTACTAGGCT AGGTTAAAGA GTACTTGGCT GCGTTAGGCA GTACTTGGCT G 351 

(2) INFORMATION FOR SEQ ID NO:42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 418

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 27 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

Qrp,j.,pp,p^^^,j, GATAGATATG TCAGGAGCTG ACTATAGTCA GCAGATTTTG AGAAGCTGAT 60 

TGGTGATTGC CGTTTGGCCC ACATATGTTT GCTAAGAACC ATCAGAGCAA TTATCTGATT 120 

CAGTCCTTGT TGCTCTAGGT GTTGTATGAA CCTAAATCTG CTTTGTCCTG GTAGGTGAAA 180

GCTGATAAGG ACAGCCCCAA CCTAGCCCAG CTGCAGCAGG CCTCTCGGGG AGTGAACCAG 240

GCCACTGCCG GCGTTGTGGC CTCAACCATT TCCGGCAAAT CACAGATCGA AGAGACAGGT 300 

AGCCTTTCCA AAGGGACCCT TTTCTTACCC ACCCTGTTGA GCTCTTCTCT GCATCCTTCC 360 

CTGTGATCCC AACCAAATCC CACAGGACTG TGTCTAAATT CTTTCATATT TTTCATCT 418 



(2) INFORMATION FOR SEQ ID NO:43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 279 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(x) FEATURE: exon 28 of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 



TTTCCACAGA GCATTGGCAT TGGCTGCCTC TCAGGTGCCA GTCAGCCAGG GTAGAATTTG 60 

ATGAGACCTT CTTGTTTCCA TCCTTGCAGA CAACATGGAC TTCTCAAGCA TGACGCTGAC 120 

ACAGATCAAA CGCCAAGAGA TGGATTCTCA GGTTAGGGTG CTAGAGCTAG AAAATGAATT 180 

GCAGAAGGAG CGTCAAAAAC TGGGAGAGCT TCGGAAAAAG CACTACGAGC TTGCTGGTGT 240 

TGCTGAGGGC TGGGAAGAAG GTAAGCTGAC TCAAAGGAT 279 



(2) INFORMATION FOR SEQ ID NO:44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3715 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 29 and partial cds of HIP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 



AACATAAATT ATCATTGTCT TTTAGGAACA GAGGCATCTC CACCTACACT GCAAGAAGTG 60 

GTAACCGAAA AAGAATAGAG CCAAACCAAC ACCCCATATG TCAGTGTAAA TCCTTGTTAC 120 

CTATCTCGTG TGTGTTATTT CCCCAGCCAC AGGCCAAATC CTTGGAGTCC CAGGGGCAGC 180 

CACACCACTG CCATTACCCA GTGCCGAGGA CATGCATGAC ACTTCCCAAA GACTCCCTCC 240 

ATAGCGACAC CCTTTCTGTT TGGACCCATG GTCATCTCTG TTCTTTTCCC GCCTCCCTAG 300 

TTAGCATCCA GGCTGGCCAG TGCTGCCCAT GAGCAAGCCT AGGTACGAAG AGGGGTGGTG 360 

GGGGGCAGGG CCACTCAACA GAGAGGACCA ACATCCAGTC CTGCTGACTA TTTGACCCCC 420 

ACAACAATGG GTATCCTTAA TAGAGGAGCT GCTTGTTGTT TGTTGACAGC TTGGAAAGGG 480 

AAGATCTTAT GCCTTTTCTT TTCTGTTTTC TTCTCAGTCT TTTCAGTTTC ATCATTTGCA 540 

CAAACTTGTG AGCATCAGAG GGCTGATGGA TTCCAAACCA GGACACTACC CTGAGATCTG 600 

CACAGTCAGA AGGACGGCAG GAGTGTCCTG GCTGTGAATG CCAAAGCCAT TCTCCCCCTC 660 

TTTGGGCAGT GCCATGGATT TCCACTGCTT CTTATGGTGG TTGGTTGGGT TTTTTGGTTT 720 

TGTTTTTTTT TTTTAAGTTT CACTCACATA GCCAACTCTC CCAAAGGGCA CACCCCTGGG 780 

GCTGAGTCTC CAGGGCCCCC CAACTGTGGT AGCTCCAGCG ATGGTGCTGC CCAGGCCTCT 840

CGGTGCTCCA TCTCCGCCTC CACACTGACC AAGTGCTGGC CCACCCAGTC CATGCTCCAG 900

GGTCAGGCGG AGCTGCTGAG TGACAGCTTT CCTCAAAAAG CAGAAGGAGA GTGAGTGCCT 960

TTCCCTCCTA AAGCTGAATC CCGGCGGAAA GCCTCTGTCC GCCTTTACAA GGGAGAAGAC 1020

AACAGAAAGA GGGACAAGAG GGTTCACACA GCCCAGTTCC CGTGACGAGG CTCAAAAACT 1080

TGATCACATG CTTGAATGGA GCTGGTGAGA TCAACAACAC TACTTCCCTG CCGGAATGAA 1140

CTGTCCGTGA ATGGTCTCTG TCAAGCGGGC CGTCTCCCTT GGCCCAGAGA CGGAGTGTGG 1200

GAGTGATTCC CAACTCCTTT CTGCAGACGT CTGCCTTGGC ATCCTCTTGA ATAGGAAGAT 1260

CGTTCCACTT TCTACGCAAT TGACAAACCC GGAAGATCAG ATGCAATTGC TCCCATCAGG 1320

GAAGAACCCT ATACTTGGTT TGCTACCCTT AGTATTTATT ACTAACCTCC CTTAAGCAGC 1380

AACAGCCTAC AAAGAGATGC TTGGAGCAAT CAGAACTTCA GGTGTGACTC TAGCAAAGCT 1440

CATCTTTCTG CCCGGCTACA TCAGCCTTCA AGAATCAGAA GAAAGCCAAG GTGCTGGACT 1500

GTTACTGACT TGGATCCCAA AGCAAGGAGA TCATTTGGAG CTCTTGGGTC AGAGAAAATG 1560

AGAAAGGACA GAGCCAGCGG CTCCAACTCC TTTCAGCCAC ATGCCCCAGG CTCTCGCTGC 1620

CCTGTGGACA GGATGAGGAC AGAGGGCACA TGAACAGCTT GCCAGGGATG GGCAGCCCAA 1680

CAGCACTTTT CCTCTTCTAG ATGGACCCCA GCATTTAAGT GACCTTCTGA TCTTGGGAAA 1740

ACAGCGTCTT CCTTCTTTAT CTATAGCAAC TCATTGGTGG TAGCCATCAA GCACTTCCCA 1800

GGATCTGCTC CAACAGAATA TTGCTAGGTT TTGCTACATG ACGGGTTGTG AGACTTCTGT 1860

TTGATCACTG TGAACCAACC CCCATCTCCC TAGCCCACCC CCCTCCCCAA CTCCCTCTCT 1920

GTGCATTTTC TAAGTGGGAC ATTCAAAAAA CTCTCTCCCA GGACCTCGGA TGACCATACT 1980

CAGACGTGTG ACCTCCATAC TGGGTTAAGG AAGTATCAGC ACTAGAAATT GGGCAGTCTT 2040

AATGTTGAAT GCTGCTTTCT GCTTAGTATT TTTTTGATTC AAGGCTCAGA AGGAATGGTG 2100

CGTGGCTTCC CTGTCCCAGT TGTGGCAACT AAACCAATCG GTGTGTTCTT GATGCGGGTC 2160

AACATTTCCA AAAGTGGCTA GTCCTCACTT CTAGATCTCA GCCATTCTAA CTCATATGTT 2220

CCCAATTACC AAGGGGTGGC CGGGCACAGT GGCTCACGCC TGTAATCCCA GCACTTTGAG 2280

AGGCTGAGGT GGTAGGATCA CCTGAGGTCA GGAGTTCAAG ACCAGCCTGT CCAACATGGT 2340

GAAACCCCCA TCTCTACTAA AAATACCAAA AATTAGCCGA GCGTAGTGAC GGGTGCCCGT 2400

AATCCCAGCT ACTCAGGAGG CTGAGACAGG AGAATCACCT GAACCCCAGA GGCAGAGGTT 2460

GCAGTGAGCT GAGATCACGC CATTGTACTC CAGCCTGGGC AACAAGAGCA AAACTCCGTC 2520

TCAAAAAAAA AAAAAAATTA CAAATGGGGC AAACAGTCTA GTGTAATGGA TCAAATTAAG 2580

ATTCTCTGCC CAGCCGGGCA CAGTGGCGCA TGCCTGTAAT CCCAGAACTT TGGGAGGCCA 2640

AGACGGGATG ATTGCTTGAG CTCAGGAGTT TGAGACCAGG CTGGGCATCA TAGCAAGACC 2700

TCATCTCTAC TAAAATTCAA AAACAAAATT AGCCGGGCAT GATGGTGCAT GCCTGTAGTC 2760

TCAGCTAGTT GGGGAGCTAA GGTGGGAGAA TTGCTTGAGC TTGGGAAGTC GAGGCTGCAG 2820

TCAGCCCTGA TTGTGCCAGT GCACTCCGGC CTGGGTGACA GAGTGAGACC CGTGCTCAAA 2880

AAAAAAAAGA TTCTGTGTCA GAGCCCAGCC CAGGAGTTTG AGGCTGCAAT GAGCCATGAT 2940

TTCCCACTGC ACTCCAGCCT GAGTGACAGA GCGAGACTCC ATCTCTTTAA AAACAAACAA 3000

AAAATTATCT GAATGATCCT GTCTCTAAAA AGAAGCCACA GAAATGTTTA AAAACTTCAT 3060

CGACTTAGCC TGAGTCATAA CGGTTAAGAA AGCACTTAAA CAGAAGCAGA GGCTAATTCA 3120

GTGTCACATG AGGAAGTAGC TGTCAGATGT CACATAATTA CTTTCGTAAT AGCTCAGATT 3180

AGAATGGCTA CCCCATTCTC TAGACAAAAT CAAATTGTCC TATTGTGACT CTTCTAAAAA 3240

TGAAGATGAA GAGCTATTTA ATGACACACC TTGGATTAAA ACGGGAATCA CATCTTAAAG 3300

CTAAAAATGA ACCTGCAAGC CTTCTAAATG AGTCACTGAG CATCACTAGT GACAAGTCTC 3360

GGGTGAGCGT AAATGGGTCA TGACAAGATG GGACAGCAAC AAAATCATGG CTTAGGATCG 3420

ACAAGAAGTT AAAAAACAGC TGCATCTGTT ACTTAAGTTT GTAAGACAGT GCCCTGAGAC 3480

CTCTAGAGAA AAGATGTTTG TTTACATAAG AGAAAGAAGG CCAGACATGG TGTCTCACAC 3540

GTTTAATCCC AGCACTTTGG GAGGCAGGGG CGGGTGGATC ACCTGAGGTC AGGAGTTCAA 3600

GACTAGCCTG GCCAACATGG TGAAACCCCG TCTCTACTAA AAATACAAAA ATTAGCCGGG 3660

CATGGTGGCA GGCGCCTATA ATCCCAGCTA CTGGGGAGGC TGAGGCAGGA GAATC 3715
(1) GENERAL INFORMATION:

(i) APPLICANT: Kalchman, Michael
Hackam, Abigail
Chopra, Vikramjit Singh
Nicholson, Donald W.
Vaillancourt, John P.
Rasper, Dita M.

(ii) TITLE OF INVENTION: Apoptosis Modulators That Interact with the
Huntington's Disease Gene

(iii) NUMBER OF SEQUENCES: 44

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Oppedahl & Larson
(B) STREET: PO Box 5270

(C) CITY: Frisco
(D) STATE: CO

(E) COUNTRY: USA
(F) ZIP: 80443-5270

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.44 Kb storage

(B) COMPUTER: IBM Compatible

(C) OPERATING SYSTEM: MS DOS 5.0

(D) SOFTWARE: WordPerfect

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Larson, Marina T.

(B) REGISTRATION NUMBER: 32038

(C) REFERENCE/DOCKET NUMBER: UBC.P-013US3
(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (970) 668-2050

(B) TELEFAX: (970) 668-2052



(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1164

(B) TYPE: nucleic acid

(C) STRANDEDNESS: sirub
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human

(ix) FEATURE: cDNA for Huntingtin-interacting protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ACAGCTGACA CCCTGCAAGG CCACCGGGAC CGCTTCATGG AGCAGTTTAC 50
AAAGTTGAAA GATCTGTTCT ACCGCTCCAG CAACCTGCAG TACTTCAAGC 100
GGGTCATTCA GATCCCCCAG CTGCCTGAGA ACCCACCCAA CTTCCTGCGA 150
GCCTCAGCCC TGTCAGAACA TATCAGCCCT GTGGTGGTGA TCCCTGCAGA 200
GGCCTCATCC CCCGACAGCG AGCCAGTCCT AGAGAAGGAT GACCTCATGG 250
ACATGGATGC CTCTCAGCAG AATTTATTTG ACAACAAGTT TGATGACATC 300
TTTGGCAGTT CATTCAGCAG TGATCCCTTC AATTTCAACA GTCAAAATGG 350
TGTGAACAAG GATGAGAAGG ACGACTTAAT TCAGGGACTA TACAGAGAGA 400
TCAGTGGATT GAAGGCACAG CTAGAAAAGA TGAAGACTGA GAGCCAGCGG 450
GTTGTGCTGC AGCTGAAGGG CCACGTCAGC GAGCTGGAAG CAGATCTGGC 500
CGAGCAGCAG CACCTGCGGC AGCAGGCGGC CGACGACTGT GAATTCCTGC 550
GGGCAGAACT GGACGAGCTC AGGNGGCAGC GGGAGGACAC CGAGAAGGCT 600
GAGGGGAGCC TGTCTGAGAT AGAAAGGAAA GCTCAAGCCA ATGAACAGCG 650
ATATAGCAAG CTAAAGGAGA AGTACAGCGA GCTGCTTCAG AACCACCCTG 700
ACCTGCTGCG GAAGAATGCA CAGGTGACCA AACAGGTGTC CATGGCCAGA 750
CAAGCCCAGG TAGATTTGGA ACGAGAGAAA AAAGAGCTGG AGGATTCGTT 800
GGAGCGCATC AGTGACCAGG GCCAGCGGAA GACTCAAGAA CAGCTGGAAG 850
TTCTAGAGAG CTTGAAGCAG GAACTTGGCA CAAGCCAACG GGAGCTTCAG 900
GTTCTGCAAG GCAGCCTGGA AACTTCTGCC CAGTCAGAAG CAAACTGGGC 950
AGCCGAGTTC GCCGAGCTAG AGAAGGAGCG GGACAGCCTG GTGAGTGGCG 1000
CAGCTCATAG GGAGGAGGAA TTATCTGCTC TTCGGAAAGA ACTGCAGGAC 1050
ACTCAGCTCA AACTGCCCAG CACAGAGGAA TCTATGTGCC AGCTTGCCAA 1100
AGACCAACGA AAAATGCTTC TGGTGGGGTC CAGGAAGGCT GCGCAGCAGG 1150
TGATACAAGA CGCG 1164

(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 386
(B) TYPE: protein
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(ix) FEATURE: Huntingtin-interacting protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

l^r Ala Asp Thr Leu G l.n Gly Hifj Arq Asp A\-3 Phe M«t. Glii Gin 
a 5 10 ' lt> 

Ph«? T>ir Lyfc Leu Ly£ Asp Lea Plio Tyr Arg Ser Scr Asn lcu Gin 



20 



30 



Tyr Phft Ly£j Arg val ila Qln lie Fro tiln Lc-u Pro Glu Asn Pro 
40



45 



Pa:o Asn Fhe Leu Arg Alu Sev Alii Leu S&x- Glu HiK He Swir Pnro 

50 55 <>0 

Val Val Va' life Pro Ala Glu Ala Hex Sor Pro Asp Ser Glu Pro 

65 70 "75 

Vfitl Leu Glx? LyFj A?sp Afcp Leeii Met Aiip Met Asp Ala Sfer Gin Gin 

«0 90 

Asn Leu Phc Asp Asn Lys Phe Abp Arsp P'be GHy S^.r- Pei- Ser S^r 

95 lOO 3 0ti 

Scr Asp Pro Ph«s Aku Phe Aen sex Qlth Asn Gly Val Asn Lys Asp 

110 ll-^i 

Glxi Lys Aep His Leu lie Glu Arg Leu Tyr Arg Glu He Ser G.lv 

125 130 135 

Leu T.yB Ala Gin Lcu Glu Asn Met Lys Tbr Glu 8er Gin Arg Val 

1^0 145 150 

val LGu Gin Leu Lys Gly Ki.h Viil S«r Glu Leu Rlu Ala Asp Leo 

155 IfjO 165 

Ala Glu Crln Glj^i His Leu Arg Gin Gin Ala Ala Asp Asp Cyn Glu 

170 175 180 

Phfi Leu Arg Ala Glia Leu Asp Glu Lou Arg Gin Arg Glu Asp Thr 

185 190 195 

GJii Lyt Ala Gin Arg Ser Leu Ser Glu He Glv Arg Lya Alfi Gin 

200 205 210 

Ala Asn Glu Gin Arg Tyr Ser Lyi» Leu LyS Glu LyS Tyr Hax Glu 

215 220 225 

L^i val Gin Aan His Ala Asp Leu Leu Arg T.yp Asn Alii Gl" Val 

230 235 ?'40 

Thr Lys Gin Val Ser M«t Ala Arcj Gin Ala Gin val Asp Leu Glu 

245 250 255 

Arg GIm LyH Lye Gl-a Leu Glu Asp Ser Leu Glu Arg lie Ser Asp 

260 265 270 

Gin Gly Gin Arg bys Thr Gin Gl" Gin Lma Glu Viil Leu Glu scr 

275 f'SO 2B5 

Leu l>yB Gin Glu LGU Gly 'ilir Ser Gin Arg G.lu Leu Gin Val Leu 
300



Gin Gly Ser Leu Glu Thr Sor Ala CIji Sex f?iii Ala Asn Trp Ala 

305 .'ill) 315 

Ala Glu Phe Ala Glu Lou Clu Dys Glu Aj.g A&p Ser Leu Val Ser 

320 .32J> 330 

G.ly Al<j Al<j Hiw Arg Glxi G3u Giu Leu Ser Ala Leu Arg Lys Glu 

333 34D 34a 

Lea Gin Asp 'Itor Gin Leu Lys Leu Ala Sf?r Th-r GJ.u Glu Ser Met 

350 3!i5 360 

Cyfi <3ln Leu Ala Lys Asp c?lii Arg r.ym Met Leu Leu val Gly Scr 

5ti!> 370 375 

Arg Lys Ala Ala Glu Gin Val lie Gin Asp Ala 

380 3aS UBS 



(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4796
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(ix) FEATURE: cDNA for Huntingtin-interacting protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CAGTGTACGG TTGATCA'i'AT AACGCCGCCC GCGtlGGATTC; OTTTATATAT .^0 
CCJCAAATTl^A TNTAC?C«;GC;G f;G<;GGATGGW CAGAGATVTC GCTTCATTAG 100 
GCCATTATAA RCAGGAAGGG TTTCAAGGAA ftAAAACCCAG AAA6TGCATA 150 
TTGCACCCAC CATGAGAAAG GGGCAACAGA CCrTN-i-Gl'l'M TGT1'N'JX::AAC 200 
CGCCTGCTTC TGTTTTAGCA ACGC7U3TG'n' I'l^GGlXSGAAO TTGTCCCATG 250 
TCTTCCACAA ANTC'l-'lH^CGA GATCCACACL' CGAAC«TCC:T <5AAGGACTTT 300 
GTGAGATACA GAAA'IGAATT GAGTGACATC AGCAtifiATRT <3GGGCCACCT 350 

GAGCGAGGGG tatc^gc<:a<3c totccagcat otacctgaaa C'TCSCTAAGAA 400 

CCAAGATGGA GTACCACACr AAAAATCCCA GGTTCCICAGC! CAACCTGCAG 450 

ATGAGTGACC GCCAGCTGGA CGAGGCTGGA GAAAGTGACG TG?iACAACT'J' 500 

TTTCCAGTTA ACAGTGGAGA TGTTTGACTA CCTGGAG'i'G'J^ GAAC'l'CAACC 550 

TCTTCCAAAC AGTATTCAAC TCCt.TJGGACA TGTCCCGCTC TGTGTCCGTG 600 

ACGGCAGCAG GGCAGTGCCG CCTCGCCCCG CTGATCCAGG TCATCTTGGA 550 

CTGCAGCCAC CTTTATGAC!? ACACTGTCAA GCl'TC'l'Cil'C AAACTCCACT 7O0 

CCTGCCTCCG AGCTGACACC CTGCAAGGCC ACCGGGACCG CTTCATGGAG 75 0 
CTTCAAGCGG
TCCTGCGAGC 

CCTCATGGAC 
ATGACATCTT 
CAAARTGGTGf 
CAGACAGA'iX; 
GCCAGCGGG'l' 
GATCTGCCCG 
ATTCCTGCG<i 
AGAAGCCTCZA 
GAACAGCGAT 
CCACGCTGAC 
TQGCCAGACA 
GATTCOTTGC 
GCTGRAAOTT 
AGCTTHAOGT 
AACTC5GC;r.AG 
GAGTGGCGCA 
TGfCAGGACAC 

cttgccaaag 
ggagcaggtg 

tccagctgc:a 

CCCAGAAGAC 
'IGACCAGCGA 
CCTGAGCCTG 
AACCCrCGCC 

ccgacagcag 
gaggacctcc 

CCTGGTGGAt: 
CGGCCAGAAT 
O'l^CAAATrGG 
GCAAGCTAT'l' 
TTGTGOAGAC 
AACTCTCGAT 
GGGAGCCACT 
GGAAATTTGA 
GCCCAGCTTC 
CCTAGCCCAG 
GCG'n^TGGC 

aacatggact 
ggattctcag 

GICAAAAACT 

gctgagcgct 
agtggtaacc 

TAAATCCTTG 
AATCCTTGGA 
AGGACATGCA 
GTITGGACCC 



AGTTGAAAGA 
CTCATl'CAGA 
CTCAGCCCTG 
CCTCATCCCC 
ATGGATGCCT 
TG6CAGTTCA 
TGAACAAGGA 
AG'i«GGATTGA 

tgtgctgcag 
agcagcagca 
gcagaac'i'gg 

ATAGC'AAGCT 

CTGCTGCGGA 
AGCCCAGGTA 
AGCGCA'ICAG 
CTAGA<>AGC'r 
TdTGtrAAGGC 

{:c:GAGa"rccjc 

GnTCATAGGC? 

TCAGCTCAAA 
ACCAACGftAA 
Al'ACAAGACG 
'IGGGTCTGCA 
TtlGAGCAACT 
ATCAGTGGAC 
CGCCATTGCT 
CCGACTCACT 
TACCTGGCCT 
AGCCA'lGAGG 
•iGCCCAGGGG 
AAGGAGATGG 
AGAGGAGATG 

aggtgaatga 

caggtgctca 
ccccacoggt 

CGACAGAACG 
GTCATGGTGG 
G<3AGCTAATG 
TGGCTCCATC 
CTGCAGCAGG 
CTCAACCATT 
TCTCAAOCAT 
GTTAGGGTGC 
GGGAGAGCTT 
GGGAAGAACG 
GAAAAAGAAT 

ttacctatct 

GTCCCAGGGG 
TGACACTTCC 
ATGGTCATCT 



TCTG'lTC'l'AC 
TCCCCCAGC'l' 
a>CAGAACATA 
CGACAGCGAG 
CTCA{3CAGAA 
TTCAGCAGTG 
TGAGAAGGAC 
AGGCACAGC'l' 
CTGAAGGGCe 
CC'IXiCGGCAC 

acgagc'ix::ac 
tl'tgagatag 
aaaggagaag 
agaatgcaga 

GATTTGGAAC 
'ISSACCAGGCC 
•ICAACCAGGA 
AUCrCTSGAAA 
CGAGCTAGAG 
AGGARGAATT 
CTGGCCAGCA 
AATGCTTCTG 
CCCTGAACCA 
GATCACCTCC 
GGAGAAAACiC 
TTCTCCATTC 
CATGGTGCCA 
GACCGAGGCC 
CCCTG6AGGA 
AACTGCCTGA 
ACTGGACAl'C 
CGGCCACTTt: 
nT<?At3C7UiAT 
AAGGATCCTT 
TCGTGGCCTC 
ACAGOAl'CCC 
ACT'l'A'I'CTe.'A 
ATt^CAGCTGA 
GTGTGTTCTTC 
CAAGGTGAAA 
CCTCTCGGGG 
TCCGGCAAA'l' 
GACGCTGACA 
TAGAGCTAGA 
CGGAAAAAGC 
AACAGAGGCA 
AOAGCCAAAC 
CGTGTGTGTT 
CAGCCACZACC 
CAAAGATC:rX? 
CTGTTCTTTT 



CGCTCCAGCA 
GCCTGA(I!VAt: 
TL'AGtXJGTGT 
CCAGTCCTAG 
TTTATTTGAC 
ATCCCTTCAA 
CACa'l'AA'iM'G 

acaaaacatg 
al'gtcagl'ga 

caggcggc!c:g 
gaggca(3c:gg 
aaaggaaagc 

TACAGCGAGC 
GGTGACCAAA 
GAGAGAAAAA 
CAtSCGGAAGA 
ACTTGGCACA 
CTTCTGCCCA 
AAGGAGCGGG 
ATCTGCrCTl' 
CAGAGG2!kATC 
G'l'GGGGl'CCA 
GCrrCAAGAA 
rCTCCAHGGT 
TGGAGCCAGT 
C!ATAACCCTG 
CCACCTGCCT 
'I^TAACCAG'l' 
AGAGGGAAGC 
GCAACATCAA 
AACCAGGAGG 
AGCTGCTATT 
CCCGAGCAGG 
CGTTGCTGTA 
'I'AAGGACC'lx:: 
CTAAAGAGTT 
GCrCTCCAAGC; 
TCTGGTGGTA 

atgaaattgc 
gctgataagg 
agtgaaccag 
cacagatcga 
c:agatcaaac 
aaatgaattg 

ACTACGAGC'l- 
TCT{.'C:At:CTA 

t:AAGArnrnA 

ATTTCCCCAG 
ACTGCCATTA 
TCCATA6CGA 
CCCGCCTCCC 



Af?CTGCAGTA 
rCAC'CCAACT 
GGTGGTGATC 
AGAAGGATGA 
AACAAGTTTG 

'J'1"ik;aacaGt 
agcgactata 
aagactgaga 

GrTGGAAGCA 

acgac:tgtga 
gaggacaccg 
tcaagcc?^t 

TGGTTCAGAA 
CAGG'I\3'ICCA 
AGACCTGGAG 
CTCAAGAACA 
AGCC'AACGGG 
GTCAG/iAGCA 

acagcctggt 
cggaaagaac 
tatgtgccag 

CGAAGGCTGC 
GCTL'LTCTCA 
CACATCCATT 
ATCTGGCCTG 
CTGGCCCACT 
CAGAGCCCCA 
A'IGGCAGCGA 
CTl'GAGAA'IXi; 
GCeCATCCGt: 
AGCTGGGGGA 
GAAACTTGCA 
AGACACAGGA 
CCAGCCa'CAT 
CAGaGaGAGA 
TTATt^CCAAG 
CTGTGGGCTC? 

caaggcagag 
tgctagcaca 
acagccccaa 
gccactgccg 
agacacagac 
gccaagagat 
cagaaggagc 
'i'gctggtgut 

GACTGCAAGA 
TATGTCA<3Tt3 
CCACAGGCCA 
CCCAGTGCCG 
CACCCTTTCT 
TAGTTAGCAT 



fiOO 

90D 
950 
iOOO 
1050 
ilOO 
IIBD 
1200 

1300 

1-550 

1400 

1450 

1500 

.1&I50 

1600 

1650 

1700 

1750 

IBOO 

1G50 

1900 

1950 

3000 

2050 

2100 

2150 

2200 

2250 

2300 

2350 

MO 

2450 

2500 

255D 

26Q0 

5650 

ii7ft0 

2750 

2800 

2650 

2900 

2950 

3000 

3050 

3100 

.3150 

3300 

3250 

3300 
CCAGGCTGGC CA6TGCTGCC CA'i'GAGCAAG CCTAQGTACn AAGA<3CWl3TG 33bD

GTGGGGGGCA GGGCCACTCA ACAGAGAGGA CCAACATCCA GTCCTGCTGA 3401) 

CTATTl'GACC CCCACAACTiA 'i-GCGTATCCT TAATA£TAf5f3A C5CTf3CTT(?TT 3451) 

GTTTGTlGAC AGC'1"1'GGAAA GG<5AAGATCT TATGHCTTTT CTTTTCTGTT 350D 

TTCTTC'IiCAG 'I"Ca'r'i''l'CAGT TrCATCATTT GCACAAACTT GTOAGCATCA 3550 

GAGGGCTCAT GGATrCCAAA CCAGGACACT ACTCCTGACTAT CTGCACAGTC 3600 

AGAAGGACGG CAGCiACTGTC CTGGCTCJTC5A ATGCCAAAGC CATTCTCCCC 3650 

CTCTTTGGGC AGTCSCCATRG ATTTCCACTG CTTCTTATGG TGCTrGGTrG 3700 

GGTTTTTTGG TTTTGTTTTT TTTTTTTAAG TTTCACTCAC ATAGCCAACT 3750 

CTCCCAAAGG GCACACCCCT GGGGCi^GiAGT CTCCAGCGCC crcXXJAArTQT 3800 

GGTAGCTCCA GOGATGGI'GC 'iXjCCCAGGCG TdTCGGTCJHT CCATCTCCGC 3850 

CTCCACACTt; AOCAAGTGCT GGCCCACCCA RTCCATGCTC CAGGGTCAGG 3900 

CGGAGCTOCT GAGTGACZACSC: TTTCCTCTAAA AAGCAGAAGG AGAGTGAGTG 3S50 

CCTTTCCCTC CTAAAGCTQA ATCCCGGCGG AAAGCCTCTG TCCGCCT'P'i'A 4D00 

CAAGGGAGAA GACAACAGAA AGIAGGGACAA GAGGGTTCAC ACAGCCCAG'l' 4050 

TCCCGTGACG AGGCTCAAAA ACTTGATCAC ATGCTTGAAT GGAGCTGCTCi 4100 

AGATCAACAA CACTACTTCC CTGCCGGAAT GAAC'lXi'i'CCC TGAATGCTCT 4150 

CTGTCAAGCG GGCCGTCTCC Ca"J-GGCCCAC agacggagtc; tgggagtgat 4^00 

TCCCAACTCC TTI'C'IXUCAGA CCSTCTGCCTT GGCATCCTCT TGAATAGGAA 4250 

GATCGTTCCA CTTTCTACGC: AATTRACAAA CCCGGAAGAT CAGATGCAA'l" 43 OC 

TGCTCCCATC AGGGAAQAAC CCTATACTTG GTTTGCTACC CTTAGTAn'U' 43 50 

AQTACTAACC TCCCTTAAGC AGCAACAGCC TTiCAAAGAGA TGCTI'GGACC 4400 

AAl'CAGAACT TCAGGTGTGA CTCTAGCAAA GCI'CA'l'C'l'Tl' O'lXSCCCGGCT 4450 

ACATCAGCCl'' TCAAGAATCA GAAGAAAGCC AAGGTGCTGG ACTGTTACTG 4500 

ACTIXSGATCC CAAAGCAAGG AGATCAITTG GAGClK^'ITGC GTCAGAC3AAA 4 550 

ATC3AGAAAGC ACAGAGCCAG CGGCTCCAAC 'ICCTTTCAGC CACATGCC?CC 4600 

AGGCTCTCGC TGCCCTGTGG ACAGGATGAG GACAGAGGGf? ACATRAACAG i650 

CTTGCCAGGG ATGGGCAGCC CAACAGCACT TTTCCTCTTC TAGATGGACC 47 00 

CGAGCATTTA AGTGACCTTC TGATCTTGGG AAAACAGCGT CTTCCTTCTT 4750 

TATC'l'A'l'AGC AACTCATTGG TGGTAGCCAT CAAGCAC'l'lX; GGAA'l"!' 4796 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 924
(B) TYPE: protein
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(ix) FEATURE: Huntingtin-interacting protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Ser Arcj Met Trp Giy H.i b U&m Ser Glu Giy Tyr Gly Gin Leu 
1' 5 10 15 



Cye Sear lie Tyr Lhu Lys Leu L,&u ACg Thr: Lys Nmt Glu Tyr His 

20 25 30 





PCT/US99/11743 
TJir Lys Asn Pro Arg Phe Pro Gly ABn Leu Gin wet Ser Acp Arg

3& 40 45 

Oln Leu Asp Clu Ala Gly Glu Scr Asp Val Asn Aen Phe Pht? Gin 

EiD 5!-;; 60 

beu Thr Val Glu Met Pbe A»p Tyr T.f!u Gl.u Cys Glu Leu hsn Leu 

65 70 75 

Pho GlTi 'I'hr Val Pi^e Asn Ser Leu Asp I4et Ser Arcr Sor val Ser 

80 B5 90 

Val Thr Alia Ala Gly t51jl Cy£ ArQ Leu Ala Prft l.eu lie Glu Val 

95 100 105 

Lli^ LQU A3P Cys fler His JjSU 'lyr Asp Tyr Thr Val Lys Leu Lgu 

110 115 12D 

Phe Lyt: Ltfu His Ser Cys Leu Pro Ala Asp Thr Lttu Gin Gly Hibi 

125 1^0 135 

Arg Asp Arg Phe Met Glu Gin Phe Thr Lys Leu Lys Asp Leu Phe 

140 145 150 

Tyr Arij Eifer Sisr Asn Lfiu CSln Tyr Phe LyS Ai-jj L«u 11(5 Gill lie 

lt>.ti 16 D 16!> 

Pro Gin Leu Pro Gl.\\ Aen Pro Pro A»n Phe L<ra Ar© Ala Ser Aila 

170 ■l-?^ IBO 

LCD Ser Glu His lie Ser Pro Val Val Val He Pro Ala Glu Ala 

185 19D IBS 

Ser Ser Pro Asp Ser Glw Pro Val Leu Glu Lys Asp Asp Lew Met 

200 205 210 

Asp MoL Asp Ala scr ClJi Gin Asn Leu Pho Asp Asn Lys Phe Asp 

21& 220 22 B 

Asp Il« P}it? Gly Ser Ski Phe Ser Ser A&p Pivo Phe Asm Phe Aan 

330 ^35 240 

f5er Gin Asn Gly val Asn Lys Asp Olu Lys Asp His Leu lie Glu 

2db 2f>0 2!>b 

Arg Leu 'IVr Arg Glu lie Ser Gly Leu Lys Ala Gin Leu Glii Asn 

260 265 270 

Met Lys Thr «lu sor Gin Arg Val val Leu Gin Leu Lys Gly «is 



275 



2 BO 



285 
val Ser Glu Leu Glu Ala Asp Leu Ala Qlxt Gin Cln Hi a Leu Arg

290 295 300 

Gl2j C3ln Aia Ala Asp A-rsp Cys Clu Flic Lcii Arg Ala Glu Leu Asp 

305 31D 315 

Glu L«u Aicg Arg Gin Arg Clu Asp ThiL CJlu Lya Ala c;ln Arg ser 

320 32!> 330 

Leu Sor Glu He Glu Arg Lys Al« Gin Ala Acn Glu Gin Arg 'Pyr 

335 340 345 

Ser T.y» Leu Lye Glu Lys Tyr Ser Glu Leu val c;ln Aen His Ala 

J50 5^)5 560 

Asp Len\ Leu Aixj Lyti Aan Ala Glu VaH Th-r T.vs G.ln Vs I Ser Mat 

365 370 375 

Ala Arc} Gin Ala Gin val Asp lcu Clu Arg Glu Lys Lys Cilu Leu 

38!5 390 

Glu Asp Ser Leu Glu Arg Tie Ser Asp Gin Gly Gin Arrr Lys Thr 

395 4 00 405 

Gin Clu Gin Leu Glu Val Leru Glu Ser Le\i Lys Gin Glu Leu Gly 

410 415 420 

'ilir Ser Gin Arg Gl^i Leu Gin val Ltfu Glu Gly Pftr Lt>u Glu Tlu^ 

i30 43 F> 

S&x Ala Cln Ser Glu Ala Asn Trp Ala Ala Glu Plie Ala Glw Leu 

4iD ^45 450 

Glx) Ly» Glu Arg Asp Sor Leu Val Sax Gly AliSt Ala His Ar© Glu 

455 >lfjD 465 

Glu Glu Leu Ser Ala Leu Arg Lys Glu Leru Gin Asp Thm Gliri Leu 

470 475 4B0 

Lys Leu Aly Sei- Tbi- Glu Glu Ser Met CyK Glu Leu Aids LyS Asp 

<18!> 490 495 

Gin Arg Lys Met Leu Leu Val Gly Ser Arg Lye Ala Ala Glu Gin 

500 505 510 

Val He G.ln Abjj Al<a Leu Aaxi Gin Leu Glu Glu Pro Pco Leu lie 



Sib 



520 



525 



Ser Cyn Ala Gly sor Ala Asp His Lcu Leu Ser Thr Val Q^r Ser 

530 536 540 



lie see ser Cya He Glu Gin Leu Glu Lys ser Trp S«x Gin Tyr 

54B 55D E>5!i 

LOU Ala Cys Pro Glxi Akp Tie Sar- Gly hev T^exi Hie Ser He Thr 

560 57U 

H?u Lfeu Ala HiG Leu Thr Sex Asp Ala lie Ala Jlis Gly Ala itix 

575 530 595 

Thr Cys Lmi Ai'g Ala Pro Pro Glu Pro Ala Asp f3er Leu Tjix^ Glu 

590 595 GOO 

Ala Cys Lys Gin Tyr G.1.y Arti Glv TTbr Leu Ala Tyv Leu Ala Ser 

605 610 615 

Leu GLx* Gl" Glu Gly aer lbu cslu Asn Ala Asp Ser Thr Ala MeL 

eao 625 630 

Arg Asn CyjP Lev Ser Lys lie Ly& Ala lie Rly Glu Glu Leu Leu 

635 04(1 0d5 

P*-o AiTQ Gly LGU Asp He Lys Gin Glu Glu Leu Gly Acp Leu Val 

650 655 660 

Acp Lys Glu Met Ala Ala Tl^r ser Ala Ala lie Glu Thr Cys Thr 

665 f>'fO <575 

Ala Arg He Glu Glu Met Leu Ser Lys Ser Arg Ala Gly Aep T}ir 

680 68 & <>9Cl 

Gly Val LyR Leu Glu Val Acn Glu Arg He Leu Arg Cys Cys Thr 

696 700 705 

Spt h&a MeL C3ln Ala He Gin Val Leu He Val Ala Ser J.ye Atjp 

710 715 720 

Leu Gin Arg Glu He? Val Glu Scr Cly Arg Cly 'ilir Ala Ser Pro 

7^5 730 735 

Lys Glu Fhe 'lyr Ala Lys Asti Ser Arg Trp T>iT- Glu Gly Lfeu 3le 

740 745 750 

Ser Ala Ser LyjP AH a V^il Gly Trp Gly Ala Tlir Val MoL val Asp 

765 770 775 

Al^ Al<j Akp Leu Val Val tiln Gly Arcy Gly Lys Phe Glu Giu Leu 

780 7B5 790 

Met Val Cys Ser Hi h Glu lie Ala Ala Sqx Thr Ala Gin Lcu val 

795 BOO G05 



Ala Alii Sf^.r Lys Val Lys Ala Asp i^ys Asp se-r Pro Acii Leu Ala 

810 81S 820 



Gin Leti Gin Gin Ala Scr Arcf Gly val Asn Cln Ala Thr Ala Gly 

625 B30 935 

V&l val Ala Sex Thr lie Ser Gly Lys Ser Glr> He Glu Gl^l Thr 

640 845 8t>0 

Asp Asn Weit-. A6p Pli© acr iJer Mot Thr lg^2 Thr Gin ale Lys Arg 

8.^^ B6D 865 

Gin Glu net Asp Ser Gin Vfil Arg Vfil IxRW Glu Leu Rlu Asn Glu 

870 87b BSO 



Leu C5ln Lys Glu Arg Gin Lys Leu Gly Glu Leu Arg Lys Lys Ki» 

aB5 690 B95 

Tyr Clu Leu Ala Gly Visl Al<;j Gl" Gly Trp Glu Glu fJly Tiu^ Glu 

SOO 90[j 910 

Ala Ser Pro Pro 'llir hem Gin Glu Val Val Thj- Glu T.yF Gin 

915 920 9^4 



(2) INFORMATION FOR SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1090
(B) TYPE: protein
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(ix) FEATURE: Huntingtin-interacting protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

MeL LDu Leu Cyc Gin Gly Ser Glu Trp Arq Arg Afep Gin Gin Leu 

5 10 15 

Gly Thr- Ala Asn Ala Arg Gin Trp Cys Fro Leu Pro Gin Asp Ala 

20 25 30 

Gin Pro Ala Gly Ser Trp Giu Arg Cye Pro Pro Lftu Pro Pro Ala 

35 40 45 



Gly Arg Leu Gin Gly Thr Aep His Psfo Trp Gly I'rp Gly Arg Leu 

50 bb 60 



Ala Gly Gly Gly Olv Ang Gly Gly Lexi Tip GJxj Gly Leu Ser HIp 

65 70 75 

Stja- Oln Arg Lco lie iiis Leu lie Leu Lou Scr Lgu Pro Leu Ldu 

BO B5 90 

Val Phe Gin Thr Val Sei* He Aan Lys Ala He A&n T>n- Gin Glni 

9!) -J 00 105 

Val Ala Val Lye 0)v I.ys Hie A],? Ar^ Thr Cys He Leu Gly Tlir 

110 115 120 

Kie Bin Glu Lys Gly Ala Cln 'ilir Phc Trp s&r val val Asn Arcj 

12fi 130 

Dcv! Pro Leu Sex Ser floen Ala Va.l Lefu Cys Trp Lys PJie Cyc His 

liO 145 150 

val Phc llis by 5 Leu Leu Arg Asp Gly tiic Pro Asn val Lsu Lys 

155 160 165 

Asp Ser T.eii Arg Tyr Arg AHii Glu Lew Sf?r Aep Met Ser Arg Met 

•J 70 175 180 

Trp Gly His beu Ser Glu Gly 'l-yr Gly Clr^ Leu Cys ser life Tyr 

165 190 19S 

Leu Lys Leu Leu Aig Thi* LyS Met Glia Tyar KiB Hit: I.yp Asn Pro 

20D 20i> i^lO 

Arg Phe Pro Gly Asn Leu Gin Met Ser Asp Arg Gin Leu Asp Gin 

215 220 225 

Ala Gl.v Glu Sei- Aisp val Asn Atn Pli« Phe Glii Leu T>>i- Val Glu 

230 >-35 '-^^^ 

Met Phe Asp Tyr Leu Glu Cys Glu Leu Asn Leu Phe Gin 'I'hr Val 

245 250 255 

Plife Asn ser Leu Asp Met Scr Arg Ser Val Ser- Val T!hr Ala Ala 

260 265 270 

{Sly Gin Cyc Arg Leu Ala Pro Leu lie Gin val 11c Leu Asp Cys 

275 2aS 2S5 



Ser His Leu Tyr Ajpp 

290 

Ser Cys Leu Pro Ala 

305 



Tyr Thr Val Lys Leu 

295 



Asp "lOir Leu Gin Gly 

310 



Leu Phe Lys Leu His 

SOD 

His Arg Asp Arg Phe 

33.!> 
Mfet Glii Gin Phe a'hr Lys Leu Lys hfs-p Leu Phe Tyr Arg Sex Ser

320 325 330 

Asn Li^u {Sin Tyr Phe ijys Arg Leu lie Gin lie Pro Gin Leu Pro 

335 340 345 

Glu AB7i Pifj Pro Asn Phe Lou Arg Ala Ser Ala Leu Scr QIm hig 

350 355 360 

He ser Pro val val Val He Pro Ala Glu Ala Ser Ser Pro Asp 

365 370 375 

Ser Gl-o Piu Val Leu Glu Lya Asp Asp lcu Mot Asp MeL Asp Ala 

360 365 390 

Ser Gin Gin Asn Leu Phe A$p A&n hya Phe Asp A^p I]e Phe <3ly 

"JS>5 400 405 

SeT Shx Phe ser Ser Asp Pro Pho Asn Phe Asn scr t:ln Asn Oly 

410 415 42fl 

Val ASD Lyc Asp Gli^ Lya hep Hie Leu T.le Glu Arg Leu Tyr Arg 

425 430 435 

Glu He Ser Gly Leu Lys Ala Gin Leu Glu Asn Met Lys Tlir Glu 

440 445 450 

Ser Gin Arg Val Val Glu Leu Lyjs Gly Kie val 5«r Glu Leu 

455 dCO 465 

Glu Ala Asp LC5U Ala Glu Gin Gin His Leu Arg Gin Gin Ala Ala 

470 475 480 

Aap Asp CyK Glu Pha Leu Arg Ala Glu Leu Asp Glu Leu Ai:g Arg 

i85 430 495 

GlT» Arg Glu Asp Thr Glu Lys Ala Gin Arg Ser Leu Ser Glu He 

600 5D5 510 

Glu Arg Lys Ala Gin Ala A»ij Glu Gin Ary Tyi- S^r: Lyw Len Lys 

515 520 525 

Glu Lys Tyr &or Glu Leu Val Gin Asn His Ala Asp Leu Leu Arg 

530 535 540 

Lys Asn Ala Gl\; Val Thy; Lys Gin Val Stfir Meh Aim Arg Gin Ala 

545 550 555 

Gin Val Asp Leu G]u Are Glu. Lys Lye ulu Leu Glu Asp ser Leu 



560 



56.^ 



570 



wo 
Olu Arg lie Ser Atp Gin Gly Glv\ Arg Lys Itor Gin GIu Gin Lev

575 58B 585 

Glu Val L6U CSl'u Ser Leu Lys Gin Glii Leu Ala Thr Scr Gin Arg 

590 Fi9B 600 

Glu Leu Gin Vfil t5ln Gly Sear Iiex> Glu Tin- S«t.- Ala Olii Ser 

60!ti 610 

Glu Ala Asn Trp Ala Ghi Pbe Ala Glu Leu Glu Ly& Glu Arg 

620 625 630 

Asp S^r: li«3u Val Ser Gly Ala Ala His Arg GlM Glu Glu Leu Ser 

e:i5 <idO S46 

Ala LGu Arg Lys Glxi hfsvL Gin Asp Thr Gin Leu Lys Le\i Ala Seir 

650 655 6S0 

Thr Glu Glu Sor Met Cys Gin Leu Ala Lys Acp Gin Arg Lys Met 

665 670 675 

LCU Leu Val Gly 5ex Arg Ly« Alri Al ja Glv GIti Val He Glii A£p 

680 «>85 *^^fJ 

Ala L6U Asn Gin beu Glu Glu Pro Pre L.eu lie Ser Cys Ale Gly 

695 "700 705 

Ser Ala Asp HIb Leu Leu Eier Thr Val Tlir Ser Ilo Ser Ser Cys 

71D 715 720 

He Glu Gin Leu Glu Lys^ Ser Trp Ser Gin Tyr Leu Ala Cys Pro 

725 730 '735 

GlTi Asp rJe Ser Gly Leu Leu His Ser lie Tlir Leu Leu Ala His 

740 Vd5 75D 

Leu Thr Ser Asp Ala lie Aia His Giy AJa Tbr Thr Cyn Leu Ars 

755 760 '765 

Ala Prtjt Pro Glu Pro Ala Asp Bcr Leu Thr Glu Ala Cys Lys Gin 

770 775 780 

Tyr Gly Arg Glu Thr Leu Ala Tyr Leu Ala Ser Lev Gl.u Glu Glu 

7B5 790 795 

Gly Ser Leu Giu Asn Ala Asp Ser Thr Ala MeL Artj ASii cys Leu 



aoo 



805 



810 



ser Lys lie Lys Ala He Gly Giu Glu Leu T..eu Pro Arg Gly Leu 

B15 820 
Asp He Lys Gin Glu GI12 Ltiu. Gly Asp Leu Val Abj) Lys G.lv M«t

S30 83b S40 

Ala Ala Thx Ser A1<-j Al.i lie CJlu Thr Al<j Thi Alfi ATg IJe Glu 

8dS StiO 855 

Glu Mst Leu Ser Lys Ser Ar© Ala Gly Asp Thr Gly Val Lys Leu 

860 865 670 

Glu Val Asn Glxi A^rg lit* Leu Gly CyS CyS Thar S*»i' Lru MRt Glil 



Ala lie tiln Val Lev He Vnl Ala Eer Ly« Agp Lex> Gin Arg Gl" 

B90 895 900 

He Val Glu aer Gly Arg Gly 'iThr Ala Ser Pro Lys Glu Plie 'i-y^ 

905 910 915 

Ala Lys Acn 9er Arg Trp 5>ir Glu Gly Leu He S^t A.la Ser Lye 

920 925 930 

Ala V<-il Gly Trp Gly Ala Thr Val Met Val Asp Ala Ala Asp Leu 

935 940 945 

Val Val Gin Gly Arg Gly Lys Fhe Glu Glu Leu Met Val Cys Ser 

950 955 960 

Hip Glu He Ala Ala Ser TJir Ala Gin Leu Va) Ala Ala S<?r Lys 

965 970 975 

Val Lys AJa Af?p Lys Asp Scr Pro Asn Leu Ala Gin Leu Gin Glii 

i?80 S85 590 

Ala Scr Arg Gly Val Abu Glii Ala Tin Ala (3ly Val Val Ala Bav 

995 .1.000 100.5 

Thr He Sen- Gly Lya ser Gin ±le Glu Glu 'i'hr Asp Acn Met Asp 
lOlO 1015 1020 

Phe ser ser Mot iTir Leu Thr Gin He Lye Arg Gin Glu Wet Asp 
1025 1030 1035 

Ser Gin Vul Arg Val Leu Glu Leu Clu Asn Glu Leu Gin Lys Glu 
1040 1045 1050 

hxg Oln Lys LGU Gly Glu Leu Arg Lys Lys Hi^ Tyi- Glu Leu Aiw 



875 



888 



1055 



1060 



10^5 



Gly val Ala Glu Gly Txp Glu Glu Gly Thr Glu Ala Ser Pro Pro 
1070 107S 1080 



Thr Leu Gin Glii Val Val Thx Glu Lyc Glu 
1085 1090 



(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3301
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(ix) FEATURE: cDNA for Huntingtin-interacting protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

CGC;TGA(5CTC GAGGAGCAGC GGAAGCA{?!AA GCAGAAGGCC CTGGTPGATA 50 

A-TCiAGCACSCT CCGCCACGAG CIXJGCCCAGC TGAGGGCTGC CCAGCTGGAG 100 

CGCGAGCX3GA CTnCAC^GGCCIT GL'GTCAGGAG GCrGAGAGGA AGGCCAGTGC 15[) 

CACGGAGGCG CGCTACAACA AGCTGAAGGA AAACCACAGT GAGCTCGTCC 200 

ATGTGCACGC GGAGCTGCTC AGAAAGAAtTG CGGACAUAGC CAAGCAGC'I'G 250 

AfrUGTGACGC AGCAAAGCCA GGAGGAflQTG GCGCGGOTGA ACCAGCAGCT ZOD 

GGCCTTCCAC C'lSSGAGCAGG TGAAGCGG<3A GTCGGAGTTG AAGOTAGAGC 350 

AGAAGAGCGA CCAGCAGGAG AAGCTCAAGA GGGAGCTGGA CIGCCAAGGCC 400 

GGAGAGCTBG CCCCCGCGCA GGAGGCCCTG AGCCACACAG AGCAOACSCAA 450 

GTCGGAGCTG ACCTCACGGC TOG AC AC ACT GAGTGCGGAG AAGGAI^CTC 500 

l^GAGTGGAGC TC;TCCGGCAG CCGOAGGCAG ACCTGCTGGC GGCGCAGAGC 550 

CTGGTGCGCG AGACAGAGGC G<3C<3C:TGAGC CGGCAGCAGC AGCGCAGCTC 600 

CCAGGAGCAC GGCGAGTTCC AGGGCCGGCT GGCAGAGAGt? GAi^TC'lNiiAGG 650 

AGCAGGGGCT GCCCCAGAGG CTGCTGGACG AGCAGTTCGC AGTGTTCSCGG 700 

GGCGCTGCTG CCGAGGCCGC GGGCAl'CCTG CAGGATGCCG T6AGCAAGCT 750 

GGACGACCCC CTGCACCTGC GCTG'I'ACCAG CTCCCCAGAC TACCTGGTGA 800 

GCAGGGCCCA GG^GGCCTTG CSATGCCGTGA GCACCCl'GGA OGAGGGCCAC 850 

GCCCAGTACC TGACCTCCTT GGCAGACGCC TCCGCCCTCG "I'GGCAGCTCT 900 

GACCCGCTTC TCCCACCTGG CTGCGGATAC CATCATCAAT GGCGG'l'GCCA 950 

CCTCGCACCT GGCTCCCACC GACCCTGCCG ACCGGCTCAT AGACACCTGC 10 DO 

AGGGAGTGCG GGGCCCGGGC TCTGGAGCTC ATGGGGCAGC TGCAGOAGCA 1050 

GCAGGCTCTG CGGCACATGC AGGCCAGCCT GGTGCGGACA CCCCTGCAGG 1100 

GCATCCTTCA GCTGGGCCAA GAACTOAAAC CICAAGAGCC'l' AGATGTGCGG 1150 

CAGGAGGAGC TGGGGGCCGT GGTCGACAAG GAGATQCCGG CCACATCCGC 12 00 

AGCCATTGAA GATGCTG-JGC GGAGGATTGA GGACATGATG AACCAGGCAC 12 50 

GCCAC?GCCAG CTCGGGGGrG AAGCTGGAGG TGAACGAGAG GATCCTCAACI 13 00 

'I'CCTGCACAG ACCTGATGAA GGCTAl'CCGG Cl-CC'lGGTGA CGACATCCAC 1350 

a^GCCTGCAG AAGGARATCG TGGAGAGCGG CAGGGGGGCA GCCACGCAGC 1400 

AGGAATTTTA CGCCAAGAAC TCGCGCTGGA CGGAAGGCCT CATCTCGGCC 1450 

TCCAAGGCTG TGGGCTGGGG AGCCACACAG CTGGTGGAGG CAGCTGACAA 1500 

GGTGGTGCTT CACACGGGCA AGTATGAGGA GCTCATCGTC TGCTCCCACG 1550 

AGATCGCAGC CAGCAtGGCC CAGCTGGTGG CGGCCTCCAA GGTGAAGGCC 1600 
AACAAGCACA GCCCCCACCT GAGCCGCCTG CAGGAATGTT CTCGCACAGT 1650

CAATGAGAGG tJCTGC0AA'l\3 TGGTGgCCTC CACCAAQTCA GGCCAGGAGC IVOD 
AGfATTGAGCA CAGAGACACC ATGOATTI'CT CCGGCCTGTCT CCTCTATCAAG 1750 
CTGAAGAAGL' AGGAGATGGA GACGCACiCTG CGTGTCICTGG ARCTCSGAGAA 1800 
GACGCTGGA<; GCTGAACGCA TGCGCSCTGGG GOAOTTCCGG AAGCAACACT 1850 
ACGTGCTGGC TGGGGCATCA GGCAGCCCTG GJiGAGGAGGT GGCCATCCGG 1900 
CeCAGCACTG CCCCCCGAAG TGTAACCACC AAGAAACCAC CCC'1X3GCCCA 1950 
GAAGCCCAGC G'IGGCCCCCA GACAGGACCA CCAGCTTGAC AAAAAGGATG 2000 
GCATCTACCC AGCTCAACTC GTGAACTACT AGGCCCCCCA GGGGTCCAGC 2 050 
AGGGTGGCTG GTGACAGGCC 'I'GGGCCTCTCJ CAACTGCCCT GAC?AGGACCG 2100 
AGAGGCCTTG CCCCTCCACC TGGTGCCCAA CCCTCCCGCC CCACCGTCTG 2150 
GATCAATGTC CTCAAGGCCC OTGGCCCTTA OTGAGCCTGC AGGGTCCTGG 3200 
GCCATGTGGG TGGTGCTTCT C3GATGTGAGT CTCTTATTTA TCTGCAGAAG 2250 
GAACT1TGGG GTGCAGCCAfi GACCCGGTAG GCCTGAGCCT CAACTCTTCA 2300 
GAAAATAGTG TTTTTAATAT TCCTCTTCAG AAAATAGTGT TTTTTiATATT 2350 
CrGAGCTAGA GCTCTTCTTC CTACGTTTGT AGTCAGCACA CtGGGA^ACC 2400 
GGGCCAGCGT OGGGC'l-CCCT GCCiTCi^jGA CTCCTGAAGG TCGTC5GATGG 2dS0 
ATGGAAGGCA OACAGCCCGT tiCCGGCTGAT GQGACGAGGG TCAGGCATCC 2500 
TGTCTGTGGC CTTCTRGGGC ACCGATTCTA CCAGGCCCTC CAGCTGCGTG 2550 
Gl'CTCCGCAG ACCAGGCTCT GTGTGGGCTA GAGGAATGTC GCCCATTACC 2600 
TCCtCAGGCC CTGGCCCTCG GGCCTCCGTG ATGGGAGCCC CCCAGGAGGG 2700 
CTCAGATGCa- GGAAGGGGCC GCTTTCTGGG GAGTGAGGTG AGACATAGCG 2750 
GCCCAGGGGC TGCCTTCACT CCTGGAGTTT CCATTTCCAG CTGGAATCTG 2800 
CAGCCACCCC CATTTCCTGT TTTCCATTCC CCCGTTCTGG CCGCGGCCCA 2 850 
CTGCCCAGCT GAAGGGGTGG TTTCCAGCCC TCCGGAGAG'l' GGGCTi-GGCC 2900 
CTAGGCCCTC CAGCTCAGCt: AGAAAAAGCC CAGAAACCCA GGTGCTGGAC 29S0 
CAGGGCCCTC AGGGAGGGAC CCTGCGGCTA GAGTGGGCTA GGCCCTGGCT 3000 
TTGCCCG'l'CA GATTTGAACG AATGTGTGTC CCTTGAGCCC AAGGAGAGCG 3050 
GCAGGAGGGG 'fGGGACCAGG CTGGGAGGAC AGAGCCAGCA GCTGCCATGC 3100 
CCTCCTGCTC CCCCCAGCCC AGCCCTAGCC CTTTAGCCTT TCACCCTGTG 3150 
CTCTGGAAAG GCTACCAAA'1' ACTGGCCAAG GTCAGGAGGA GCAAAAATGA 32 00 
GCCAGCACCA GCGCCTTGGC TrrG'iXiTrAG CA'in"iX:;CTCC TGAAGrGl-lC 3250 
TGTTGGCAAT AAAATGCACT TTGACTOTTA AAAAAAAAAA AAAAAAAAAA 3300 
A 3301



(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 676
(B) TYPE: protein
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(ix) FEATURE: Huntingtin-interacting protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Gly GlU L€U Glu GlU Glr Arg LyS Gin I^yB GJa Tsyp Al;j Ij«»u VaJ 



5 



10 



15 





Asp Asn Glu Gin Leu Arg Bi» Glv lieu Aia Gin Leu Arg Ala Ala 

20 25 30 

GlTi Glu Arg Glu Arg Scr Gin Gly Leu Arcf Glu Glu Ala Glu 

35 40 45 

Arts Lys MfJ Ser Al<=i "Thr Glu Ala Arg Tyr Asn Lye Leu Lys Glu 

5iQ 60 

Lys His Ser Glu Lou Vyl His Val Ki» A^b G.lu Lev Leu Arg Lye 

65 70 75 

Asn Ala Asp Tha: Ala Lys Gin Lou Thr val Thr Cln Cln Scr Gin 

SO 85 90 

Glu Glu Val Ala Arq Val Lye Glu Gin Lena Ala Phe Gli) Val Glu 

95 100 lOrj 

Gin Val Lyc Arg Glu Ser Glu Leu Lye Leu Glu Glu Lys Ser Asp 

110 115 120 

Gin Gin Glu Lys Leu Lyi# Arg Glu Leu Glu Alii Lys Ala Gly Glu 

125 130 155 

Li?»i Ala Arg Ala Gin Glu Ala Leu Ser His Thr Glu G.lu Ser I.y» 

140 145 150 

Ser Gill L«ii sei: sat Arg Leu Asp Thr Leu Ser Ala Glu Lys Asp 

155 160 165 

Ala Leu Ser Gly Ala Val Arg Glu Arg Glxi Ala Asp Leu Leu Ala 

170 175 ISO 

Ala Gin Ser Lbu val Ai'g C5lu Hir Glu Ala Ala Leu Ser Arg Glu 

ISii 190 5 

Gin Gin Arg Ser Ser Gin Glu Gin Gly Glu Leu Gin Oly Arg Lqu 

200 20^ 210 

Ala Glu Ai:g Glu Ker Gin Glu Gin Gly Leu Arg Gin Arg Leu Leu 

215 22D 225 

Aap Glu Gin Phe Ala Val Leu Arg Gly Ala Ala hln Glu AIfi Alii 

230 235 240 

Gly 3le Leu Gin Abp Ala Val ser Lys Lcu Asp Asp Fro Leu His 

2Afj 250 255 

Leu Arg Cys Thr Ser Ser Pro Aep Tyr li«u Val Ser Arg Ala Gin 



26D 



2 65 



270 
Glu Ale, Lfeu Aap Ala Val E&r Thr Lou Glu G2u GJy His Ala Gin

275 28B 285 

Vyr Leu 'n:ir iJcr Lcu Ala Asp Ala 3&r Al£> Leu Val Ala Ala Leu 

290 295 30D 

Thr Arg "Phe Ser Hi» Ltiu Ale Als Asp Thr He lie Asn Gly Gly 

305 3i0 315 

M3 Thr Ser His Leu Ala Pro l^x Asp Pre Ala Asp Aicg Leu lie 

320 325 33D 

Asp Thr Cy& Arg CJlu uys C5ly Ala Arg Ala Leu Glu Leu Met Gly 

335 3^10 345 

Gli^i LCU Gin Asp Gin Gin Ala Leu Arg His "Ktet Gin Ala Ser Leu 

350 355 360 

Val Arg Thr Fri> Leu Gin Gly lie L«u Qlii L«>\i Gly Glzi Glu Leu 

36ti 370 375 

Lys Pro Lys Ser Leu Asip Val. Arc? Gin Glu Glu Leu Gly Ala Val 

380 385 31?0 

V£»l Asp Lys Glu Met Ala Ala Tlir Ser Ala Ala lie Glu Acp Ala 

395 400 405 

Val Arg Arg lie Glu A«t) Met Met A&n Gin A]© Arg Hi** Ala Ser 

410 415 420 

5&r Gly Val Lys Leru Glu Val Asn Glu Arg lie Leu Asn Ser Cys 

425 430 435 

Thr Abp Leu Met Lys Ala lie Arg Leu Leu val Tlir Thr Sex Thr 

440 445 450 

Ser Leu Gin Lys Glu lie Val Glu Ser Gly Arg Gly Ala Ala Thr 

455 460 465 

Gin GJ.n Glu Phe Tyx Ala Lys Amx Ser Arg Trp Thr Glu Gly Leu 

470 475 480 

lie Ser Ala Ser Lye Ala Val Gly Txp Gly Ala Thr Gin Leu Val 

485 490 495 

Glu Ala Ala Aep Lya Vul Val Leoi His Thr Gly Iiy» Tyr Glu Glu 



500 



505 



510 



Leu He val Gys ser His Glu lie Ala Ala Ser Thr Ala Gin Leu 

515 520 625 
Val Al<-3 Ai& Lys Val Lytj hln, A&i] Lyt? Hip Ser pro Hj.B I^^i



530 



535 



541) 



ser Arg Leu Gin Glu Cys Ser Arg 'i^nr val Acn Glu Arg Ala Ala 



545 



550 



555 



Asji val val Ala Ser Thr i^ys Scr Cly Gin Glu Gin lie clu Asp 



560 



565 



570 



Arg Aep Thr Met Asf Ph** Gly J^eu Leu I.le Lyp liSti I^y« 



575 



pB8 



Lye Gin Glu Met Glu Thr Gin Val Arg Val Leu Clu Lcu Glu Lys 



590 



595 



600 



C Le« Gill Alii Glu Arg M«t Ai-g Leii Gly <3lu L«u Arg Lys Gin 



605 



CJO 



61.^ 



His Tyx Val. T,eu Al;^ Gly A.la Ger Gly Ser Pro Giy Glu Glu Val 



620 



S25 



630 



Ala II Ai:g Px^rj S«ar Hir AIli Dro Aitj Ssr. Viil Thr T>n: LyK TiyR 



6d0 



64!> 



Pro Pro Leu AJ.a Gin Lys P^^o Per Val Ala Pro Arq Gin Aep Hie 



650 



655 



660 



Gin LCU Asp byjs Lys AJsp Gly lie 'l-yr Fro Ala Gin Leu Val Asn 



665 



670 



675 



Tyr 



(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2338
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: mouse
(ix) FEATURE: cDNA for Huntingtin-interacting protein - mHIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GGCACGAGGG CICATTCAGA TCCCCCAGCT GCCCGAGAAT CCACCCAACTT 50 
CCTACGAGCC TCG6CCCTGT CAGAGCACAT CAGTCCTGTG GTGGTGATCCC lOO 
GGCAGAGGTG TCAIQCCUfiQ ACA<;tGAGCC TC^TCCTCC^AC AAGGA'PGACCT 150 
CAK3GACATG GACGCCTCCC A6CAGACTTT CTTTGACAAC AACTTTCAT^A 200 
CGTCTTTGGC AOCTCATTCA GCAC5CGACCC
TGGCGTGAAC AAGC3IACL^A(JA AC3GACCACTT 
GATCAGTGGA CTGACAGGCC AGCT(:3GACAA 
GGCCATGCTO CAGCTC5AAC5G G3TCC5AGTGAlC; 
AGAGCAGCAG CACTTGGGCC CMdAOICJCTAT 

cactgagcto GATGAACTGA agaggcagcg 

GCGCAGCCTG AC'I*GAGATAG AAAGAAAGGC 
TAGGAAGTTA AAAGAGAAQT ACAG^KBAAC'I' 
GCTGCGGAAfl AACGCAtSAGO TGACCAAACA 
CCAGGTCGAT TTGHAAACSAG ACJAAAAAAQA 
GTGTAAGTGA CCAGC^CCCAG CGGAAGAIJTC 
6AGAACCTOA AGCATGAACT C3GC:CACCAGC 
CCACAGCAAC CTGGAAACCT CTRCCCAGTC 
AGATCCCCGA GTTGGAGAAG GAACAAGGCA 
CAGAGAGACG AAGAGTTATC AGCCCOX^CGA 
GATCAAGCTG GCTGtJCGCOC AGGAAIXIJCAT 
AGAGGAAA?vC CrTC?TTC;c50A CCGATCAGGA 
CAGGAGGCGC TOAGCCAGCT TGAGGAACCC 
ATCCACAGAT CACCTTCTHT HCAAAGTCAG 
AGCAACTGGA AJViGAACGGC AGCCAGTATC 
AGTGA<5Cl'lx:: TGCACTCGAT CACCCTX3CTT 
TGTCATCCAC GGGaGTGCCA CCAGCCTCCG 
ACTCGTTGAC CGAGGCCTGT AGGCAGTA1G 
CTGTCCTCCC TGC^AGGAAGA GGGAACTGTO 
CCTTAGGAAT TGCCTCAtiCA CCGTCAAGAC 
CCAGGGGCCT GGACATCAAG CACf<5AAGAGC: 
GAGATGOCAG CCACTTCAGC TGCCATTGPJSi 
GGAAATTCTC: AGTAAGTCCC GAGCAGGAGA 
TGAATGAGAG GATCC'lXiGGT TCCTGTACCA 
GTGCTCGTTG TOGCCTCCAA GGACCTCCAG 
CAGGGGTAGT GHATCCCCTA AAGAATTTTA 
CG6AAGGGCT GATATCCGCC TCCAAACG^rC 
ATGGTGGATG CTGCTGATCT TGTGt^'TCCAA 
GCITGATGG'IG TGTTCACGCG AGATTGCTCC 
CTOCATCCAA GG'lGAAAGCG AACAAGGGCA 
CAGCAGGCCT CTCGAGGAC'l' GAACCAGGCC 
AACCATTTCT GGCAAATCTC AGAT'JGAGGA 
CAAGCATGAC ACTG.^CCCAG ATCIAAGCGCC: 
AGGGTGCTGG AGCTGGAAAA TQACCTGCAC^ 
AGAGCTACGG AAGAATVCACT ACGAGCTOGA 
AGGAAGGGAC AfiAAGCATCA CCGTCTftCTG 
AAAGAGTAGA GCCAAGCCGA CACCCCACAC 







TTTCAATTTC AACAATCAAAA 250 
GATTGAACGC CTGTACAGAGA :iO0 
CATGAAGATT GAQAGCCTAGCG iiSO 
TGAGCTGGAG GCAGAGCTAGC 400 
GGATGACTGC GAGTTCCTGCG iiiO 
AGAGGACACG GAGAAGGCACA 5 00 
CCAGGCTAAT GAACAGAGGTA 5 50 
GGTGCAOAAC CATGCTGACCT 600 
GGTCTCCGTO; GCCCGCCAACC 650 
GCTAGCAGAT TCCTTTGCAC: 70 D 
AAGAGCAACA GGATGTTCTA 7SD 
AGACAGGAGC TGCAGGTCCT 80 D 
AGAAHTOAAA TGGCTGACAC Qhf) 
GCTTGGCGAC TGTTGCAGCT 900 
GACCAGCTGG AAAGCACCCA 95 0 
GTCCCAGCAG Cl^^AAGGACC 10 DO 
ACCCTGCGGA GCGTGaGATA 1050 
ACCCTCATCA GCTGTGfTAGG 1100 
CTHCGTTTCC AGCTGC^TTG 1150 
TOGCCTGCCC AGAAGAT^TT 12 00 
GCCCftCTTGA CCGGTGACP^C 12 bO 
GGCCCCACCG GAGCCAGCCG 1300 
GCAGAGAAAC CCTGGCCTAT 1350 
GAGAATGC'IXS ACGTCACAGC 1400 
CCT'i'GGCGAG CAGC'i^CTGC 1450 
TGGt^TCSACtZT GGTGGACAAG 1500 
GCTGCHACCA HCCXSGATAGA 1550 
CACGGGAGTC AAGCTCGAGG 1600 
GCCTOATGCA GGCCATdAAC; 16B0 
AAGGAGATAS TGGAGAGTGG 1700 
CGCGAAGAAC TCTCGGTGGA 171^0 
TlVSGT'i^GGG AGCrACCATC 180U 
GGCAAAGGCA AGT'I^IGAGGA 1850 
CAGTACTGCC: CAGL'TCGTCG 190D 
GCCTCAATCT GACCCAGGTG 2 GOD 
ACAGCCGCTG TGGTCGCCTC aOSO 
AACA6ACAGT ATGGACTTCT 2100 
ACSGAGATGGA TTCCCAGGW 2150 
AAGOAGCGTt: AGAAACTAOG 2200 
GGGCGTGGCT GARGGCTGGG 225 0 
TCCAAGAAGC AATACCGGAG 2300 
ATCAGAAA 233 g 



(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 676
(B) TYPE: protein
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: mouse
(ix) FEATURE: Huntingtin-interacting protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Ala Aig Gly Ltsu lie Clji lie Pro Gin Leu Pl'o Glu Asn Pro Pro 

5 10 15 

Asn Phe Leu Arg Al SeT- Ala T-fjiJ Ser Glu Hi.B Tie Ser Pro V&l 

20 30 

Va] Val II Pro Ala Glu Veil Ser set Pro Asp Scr Glu Pro Val 

3b ^5 

Leu Glu Lys Ajpp Asp T.eii Met App Met Abp Ala Ser Gin <3ln Tlir 

50 <'>^ 

Lcu Phe Adp Asn Lys Phe Asp A&p Val Phe Gly Ser Ser Leu Ser 

65 70 75 

Ser Asp Pro Phe Abh P}it4 Aen Asn G1t» Abti Gly Val Aaii Lyti Aep 

80 S5 ^0 

Glu Lys Asp llis Lev! lie Glu Arg Leu Tyr Airrg Glx) T.le Srt Gly 

95 100 105 

Leu Thr Gly GIjj lg>u Asp Asn Wet Lyc lie Glu Ser Gin Arg Ala 

110 115 120 

Met Leu Gin Leu Lys Gly Arq Va}. Ser G\u L«u Glu Ala Glu Leu 

125 130 lab 

Ala Glu Gin Gin Kis Leu Rly Ai^a Gin Ala MCrL Asp ASp Cyc Glu 

140 1^5 150 

Phe Leii Arg Thr Glu i^eu Asp Glu Leu Lys Ar-y Gin Airg Glu Asp 

155 ItO 165 

Thr Glu Lyis Ala c;ln Arg Scr Lcu Thr Glu lie Glu Arg Lys Ala 

170 1'75 IBO 

Gin Ala Asn Glu Gin Arg T^r Ser Lys Leu Lys Glxi Lyn Tyr S&x- 

165 190 195 

Glv. Leu V^il Gin Arh Kifc Ala Acp Leu Leu Arg Lys Asn Ala Glu 

200 205 210 

val Thr Lys Gin Val Ser Val A3a Arg Gin Ala Gin V£»l Aep Leu 

215 22D 225 

Glu Arg Glu Lys Lys Giu Lex) Ala Akp Ser PYm Ala Arg Val Ser 





230 



23 b 



Asp Gin Ma Gin Aro Lys Thr Gin liiu Gin C3lii AaE> 1 T.enj GIxj 

245 2i>0 255 

Asn Lyis His Glu Lfeu Al^j TIit- Ser Arg Gin Gli3 Le^i Gin Val 

260 265 270 

Leu }Jis Ser Asn Leu Gl\i Thr &'er Ala Cln ^ac <31m Alij Lys Trp 

275 2&!> 

Lfiiu Thr Gin llo Ala Glii Lfiu Glu Lya (?lii Gin Gly Ser Leu Ala 

290 29t> 300 

TJjr V©3 ATra Alia Gin Arg Glij Gl.u Glu Leu iSer Ala Lou Arcf Asp 

305 310 1^15 

Gin Lt?u Glu £i<i!x lOix Gin lie Lya Ij«u A] 51 Gly ftia Gin Glu Ser 

320 VA^ 330 

Met CyK Gill Glii Val Lyt Aap Gl" Arq Lys Thr Leu Leu Ala Gly 

33!> 340 345 

Ile Arg Lys Ala Ala Glu Arg Glu Ile Gln Glu Ala Leu Ser Gln 

350 355 360 

Leu Glu Gill Piw Thr L«>u He Ser Cys Aln Gly Ser Thr App Hi k 

36LS 370 375 

Leu Leu Ser Lys Val Ser Ser Val Ser Ser Cys Leu Glu Gln Leu 

380 3BS 390 

Glu Lys Asn Cly Sor Cln 'lyr Leu Ala L'ys Pro Glu Asp 31g Eier 

400 

Glu Leu Leu His Ser Ile Thr Leu Leu Ala His Leu Thr Gly Asp 

410 415 420 

Thr val He Glii Gly Ser Ala Thr Ker- L«ii Arg Alri Pxo PxiJ Glu 

i2!5 430 435 

Pro Ala A»p Ser Leu Thr GU) Ala Cys Arg Gin Tyr Gly Arg Glu 

440 445 450 

Thr Leu AIa Tyx^ Leu Ser Her Leu Glu RIm Glu G3y Tlii:- Vul Glu 



J^n Ala A£p val Thr Ala Leu Arg Asn Can Leu R&r Arcj v^il Lys 



460 



465 



470 



47t> 



480 



Thx Leia Gly Glu Clu L<2u l^eu Pro Arg Gly Lou Asp lie hys Gin 

48b. 495 

Glu (Jlu LQu Gly ADp Leu Val ftfjp i.ye G.lv w^t Al<? Alfi Thi- f>si: 

500 50 D 

Al0 Alij lie t:lu Ala ftla Tlir Thr Arg He Glv Ghi Tl« T.eu Sejr 

515 520 

Lvs Ser Arg Ala GH.y Aajj Thr Gly Val Lys Lcu Ciu Val Asn Glu 

530 535 540 

Arg lie Leu C5ly Ser Cys Thr Ser T^eu Met. GJn Ala Tip Lyn Vfil 

B45 550 555 

Leu Val Val f5«sr Lys Asp Lcu Gin bye Glu lie Val Glu Ser 

!j60 565 57 0 

Gly Arg Gly Ser h'\-o S«r Pro Lye Hlu Phe Tyr Ala Lys Asn Scr 

575 5sa r^e5 

Arg Trp Thr Glv Gly Leu llo Scr Ala Ser Lys Ala Val Gly Trp 

>yjO 595 600 

Gly Ala Thr lie Met V<-il Akp AliS hla Aiip Li^u Val val Gin Gly 

605 

Lyb Gly Lys Phc Glu Glu h&x Met Visl Cyn Sei- Ar'y cau lie Ala 

S20 «i^t'-'' "530 

Ale Ser 'lOir Ala Gin Leu Val Ala Ala Sor Lye Val Lys Ala Asn 

655 6d0 645 

Lvp Gly L4^u Asn lcu 'lOir Gin Levi GJ.71 Glu Ala P«r Axg Gly 

65D &55 

Val ABn Gl.n Ala Thr Ala Ala Val val Ala Ser Thr He Ser G]y 

6G5 670 675 

Ly*5 Seir Gin lie Glu Glu Thr Ajpp Ser Met Asp Phts pfei- Ser Met 

(5ft0 685 5D0 

Thr Leu Thr Gin He Lvb Arg Gin Glu Me:t Asp Scr Gin Val Arg 

6B5 7 0(1 "705 

Val Leu GliJ Leu Glu Asn A£5p Leu Gin Ly» Ghi Arq Gin bvK Li?u 

710 715 

Gly Glu Leu Arg Lys T..y» His Tyr Glu Lcu Glu Gly Val Ala Glu 

725 V30 735 
G3.y Trp Gl« filu C¥.l.y Tlir 61\i Als Ser Pro Ser Thr V£il Gin Rlu 

ViO 745 V!jO 

Ala lAe Pro App Lys Glu 

755 



(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3964
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: mouse
(ix) FEATURE: cDNA for Huntingtin-interacting protein - mHIP1a
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

GGCACfSACSOn ORCRCGCGGC CTCCGTGTGC C'i'AGGC'l"iXiA GGCGGGCGGT 50 

GACGCCTCAT TCGCGCGG^^G CCGGGCCGGG ACACGGTCGC CGGCAGCATG IDO 

AACAGCATCA AGAATGTGCC GGCGCGGGUC CTGAGCCGCA GGCCGCiGCCA 150 

CAGCCTAGAG GCCGAGCGCG AGCAG'I'TCGA CAAGACCJCAG RCnATCAGTA 200 

TCAGCAAACC CATCAACAGC CACi<3AGGCC€ CAGTGftAGGA GAAGCAT6CC 250 

CGGCGTATCA TCCTGGGCAC? GHATCATGAG AAGGGAGCCT TCACCTTCTG 3 DO 

GTCCTATGrn ATCGGCCTGC CGCTGTCCAG CAGCTCCATC CTCAGCTGGA 350 

AGTTCTGTCA CGTCCTTCAC AAGGTCCTCC GGGACGGACA CCCCAACGTC 400 

CTGCATGACT ATCAGCGGTA CCGGAGCAAC A'J'ACCTGAGA 'ICGG'I'GACYT 450 

GTGGGGCCAC CTTCGTGACC AGTA'PGGACA CCTGGTC5AAT ATCTATACCA 5 DO 

AACTGTTGCT G/iCTAAGA'iC 'l'CC'i"I«CCACC TTAAt3C!ACK:C CCAfSTTTCCT 550 

GCAGGCC'IGG AGGTAACAGA TGACGTGTTO GAGAAGGCGG nGGGAAHTGA 600 

TGTCAACAAC ATTTTTCAGC TTACCGTQGA GATGTTTGAC TACATGGACT 650 

GTGAACTGAA GCTTTCTG^G TCAGTTTTCC GGCAGCTCAft CACGGCCATC 7 DO 

GCAGTGTCCC AGATGTCTTC TGGCCAG'i'G'r CGCCTAGCGC CGCTCATGCA 750 

GGTCATTCAG GACTGCAGCC ACC'I-CTACCA CTACAC^ARTR AAGCTC:ATCT 800 

TTAAGCI-GCA C'1K:CTGTCTC: CCCJGCAGACA CCCTGCAAGG CCACAGGGAT 850 

CGGTTCCACG ACCAGTTCC:A CAGCCTCAAA AACTTCTTCC GCCGGGCTTC 900 

AGACATGCTG TACTTCAAGA GGCTCATCCA GATCCCGCGG CTGCCTGAGG 950 

GACCCCCCAA TTTCCTGCGG GCTTCAGCCC TGGCTGAGCA CATCAAGCCG 1000 

GTGGTGGTGA TTCCCGAGGA GGCCCCAGAG GAAGAGGAGC C'I"GAGAACC'J' 1D50 

AATTGAAATC AGCAGTGCGC CCCCTGCTGG GGACCCAGTG GTGGTCGCTG 1100 

ACCTCTTIGA 'iY^AGACCTTT GGAtXM^CCCA ATGGCTCCAT GAAGGATGAC llbO 

AGGGACCTCC: AAATCGAGAA CTTGAftGAGA GAGGTGGAGA CCCTCCGTGC 1200 

TGAGCrGGAG AAGATTAAGA TGGAGGCACA GCCCI'ACATC 'ICCCAGCTGA 1250 

AGGGCCAGGT GAATGGCCTG GAGGCAGAGC TGGAGGAGCA GCGCAACCAG 1300 

AAGCAGAAGG CCC'JXSGTGCA L'AACGAOCAG HTGCGCCACG AGCTGGCCCA 13b>0 

GCTCAAGGCC CTGCAGCTGG AGGGCGCCCG CAACC'AGGGC CTTCGAGP.GR 1400 

AAGCAGAGAG GAAGGCCAGT GCCACGC5AGG CACGCTACAG CAAGCTGAAG 1450 

GAGAAACACA GCGAAQTCAT TAACACGCAH GCCGAGCTGC TCAGRAAGAA 1500 




CGCAGACACG GCCAAGCAGC 
TGGCACGGG'l' 
GAGTCTGAGA 
GAGGGAGHTC; 
TGAGCCGCAC 
C'1^AACGCG<3 
AGAGCTGCTG 
GCCAftGAGCA 
CTGGCAGAAA 
TGAGCAG'iM'G 
TACAGGATGC 
AGCTCCCCAG 
GAGCGGCCTG 
CTTC'i^jCCCT 
ACCA'ri'GTCA 
CGACCGCC'l'G 
TGGTGGGACA 
CTGATCCGGG 
GCCTAAGAGC 
AGGAGATGGC 
GAGGACATGA 
GGTGAATGAG 
GGCTCCTGG'J' 
GGCAGGC3GGC 
GACTGAAGGL' 
AGCTGGTGGA 
GAACTCATCG 
GGCAGCCTCG 
TGCAGGAATG 
TCCACCAAAT 
CTCTGGt'.L'TC 
TGCGAGTCTT 
GGGGAGC7TC 
TAGCGAAGAA 
CTAAGAAGCC 
AACCAGCTCGA 
TAGGCCCCTAA 
TC3TGGCTGTC1' 
CCCAAGGGGCC 
CTTAAGTAGGA 
AA'PCTGCAGCG 
TTCGTAAGTTl' 
CCTTGTCTCTG 
GGTCCCTGCTC 
GGGAGftGCAGG 

gcatccatgct 
tgatcgtggct 
tgaagccaccc 

GCTQTTC^ATTG 
GCCTTTCCTCG 



AAAGGAACAG 
'1X3AAGATGGA 
GCGGCCAGGG 
AGAACACAGl' 

agaacsgaacc 

gccgctcaga 
gcagcg6agc 
aggagtctca 
gcggtgttgc 

AGTGAGCAAG 
ACTAOrrGGT 
GAGC:AGC3GCC 
GGTGGCAGC^R 
ATGGTGCCflC 
ATGGACACAT 
GCTGCAAGAC 
CCCCCCTGCA 
C'l-GGATGTAC 
GGt:(?A<:CTCG 
TGAGCt7AGGf: 
AGGATCCTCA 
GATGACCTCC 



TGACAGTGAC 

AGAGCAGAGC 
CACCAGAGCr 
GGC'l-CAGAGC 
CCTGAOTGGA 
GCCTCIGTCCC 
TCCCAGGAGA 
GGAGCAG(?GC; 
GAAGTGCAGC 
CTGGACGACC 
GAGCCGGGCl' 
ACACCCAGTA 
CTGACCCGCT 
GACCTCCCAC: 
GCAGGGAGTG 
CAGACAGTGC: 
GGGCATTCTG 
GGCAAGAGGA 
CCAGCCA'1"TG 
CCGCCACGAC 
ACTCCTQC^At: 
ACCAGCGTGC 



CAGCAACGCA GCAGGAATTT 
CrCAl'CTCAG CCTCTAAGGC 



GTCACCrGAC 
TCTGCTCCCA 
AAGGTGAAAG 
TTCCCGCACJT 
CTGGCCAGGA 
'I'CCCTCATCA 
G<3ACSCTGGAG 
GGAAACAGCA 
GAACCCAGCA 
ACCGCTGGCC 
CAAAAAGGAT 
GGTGTTCAGC 
GGCAGTGGTC 
TAGTC'I"GTCG 
ACTCCCTCGA 
GACATCAGAt? 
AGTCAGCACA 
GACTCAAAAG 
GCTACCAGGG 
TAAGCTGGCr 
GGGAGCCCCA 
GGGGCAGCCC 
TCCCCCC6TA 
CCTACCTTGA 
TGCC 



/iAGGTTGTGC 
TCAGATTGC5G 
CCAACAAGAA 
GTCAACGAGA 
GCAGiATTGAG 
AGTTGAAGAA 
AAGACACTAG 
CTATGTACTG 
GACCCAGCCC 
CAGAAACCCA 
GGTGTGTACL' 
AGGATGGCTt; 



ACAGCAGAGC 
AGATGGAGCA 

GACCAG'I-l'GG 
GGCCCGlXiiCC 

•i'Cagc'ix::acg 
gtcgttcggc 

GGAGAAGGAG 
AGGGCGAGCT 
C'rrCGGCAC?A 
CGCCGAGGCA 
CCCTGCACCT 
CAGGCAGCCC 
CC'l-GGCrTCC 
TCTCCCATTT 
CTGGGCCCCA 
TGGAGCCCGG 
TACGGAGGGC 
CAGTTGGGCC 
GCTAGGGGCC 
AGGACGCTG'l' 
AGCTCAGGCG 
AGACCTGATG 
AGAAGGAAAT 
TATGCC!AAGA 
AGTGGGCTGG 
TTCACATGGG 
GCCAGCACGG 
CAGTCCCCAC 
GGGCTGCCAA 
GACAGAGACZA 
GCAGGAGATG 
AGGCAGAGCG 
GCTG6GGGGA 
AGCTCCCCGA 
GCATACCeCO 
CAGtJTCAACT 
GTGGTTOTGC: 



AAGGGGCCTC 
GACAGTPCAT 
CCAGC'lGGGA 
ATAGTCTGAA 
CTGGGAAAAG 
TCTGAGGCCT 
ATAAGGGGAT 
GGa'G'PCATCA 
GCAGaCCAGG 
UTGC'l'CAGCT 
CAGTTTTCCA 
TGAGTAGATT 



TGAGAAGCCT 

ctggatgtga 
cccagcaggc 
tgctgcgagg 

GTCACATAAG 
TAAGTGAACA 
GACCTOTGAC 
CCTGGGGGCC 
Cr'ri'Gl'GTGO 
CCTG'IKITC'I'C 
TTCTCHTGCn 
TCAGCCCTCC 



CAGGAGGAGG 
AGCGAAGCGT 
AGAAGCTC/sA 
CAGCAGGCCC 
GCTCGACACA 
AGCGTGAGGG 
GAGGCGCTTA 
ACGGGGGCAG 
AGCTGGTGGA 
GAGGCCATCC 
CCGCTGCACC 
'I'GGACAGCGa' 
TCCGAAGA'l'G 
tSGCTGtlGGAC 

c:cGACCc:c>Gc: 

GCTCTGGAGC 
TCAGCCCAGC 
AGGACTTGAA 
ATGGTGGACA 
CCCGAGGA'IC 
TGAAACTGGA 
AAGGCTATCt? 
TGTGGAGAG<: 
ATTCACGGTG 
GGAGCCAGAn 
CAAATACfGAG 
CC:CAGCTGGT 
TTGAGCCGCC 
CGTCGTGGCC 
CCATGGA'I*i'T 
GAC^ACACAGC 
TGTCCGGGTC 
TGGGAACACC 
AGTGGGGCCA 
CAGGACAGAC 
TCTGAACTAC 
CrTGCCCT'l'CA 
CCAACTCCTC 
ATCTATTTAT 
C'iXSAGCCACA 
TAT'IM'CTTTC 
CCAGC5AGCCT 
ACAGAAAGAG 
CCTTGAGCCA 
TGGTGCTAGG 
GAGCCTGGCA 
CCCGTGACCT 
TACTAGTGTG 
TAAAGCTC3GG 



1600 

1650 
17D0 
1750 
1900 
1350 
1900 
1950 
2000 

2100 
2150 
2200 
2250 
2300 

2450 
250O 
2550 
260D 
2650 
270D 
2750 
2800 
2850 
2900 

3000 
3050 
3100 
3150 

3:^00 

3300 
3350 
3400 
3450 
3500 
355 0 
3600 
3650 
3700 
37 50 
3&00 
3tt50 
3900 
3B50 
(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 676
(B) TYPE: protein
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: mouse
(ix) FEATURE: Huntingtin-interacting protein - mHIP1a
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Asn Ser Ile Lys Asn Val Pro Ala Arg Val Leu Ser Arg Arg 

5 ID I'j 

Ptp Gly Hi a S&r Leu Glii Alu Glu Arg GDu Gin Pfw A&p Lyjs Thr 

2D /!t> 30 

Gin Ala I.lt? Il« St?r Ly» A] a He As^m Seir GJx) GU) Ala Pro 

3!> 4U 45 

Val Lys Oiu by 5 Jlis Ala Arg Arg ilo lie Lcu Cly "irhr His Kis 

SO 55 60 

Glxi Lys Gly Al^i PHk Thx- Phe Ttj) !>«?t Tyr Ala G.ly I.jKIi Pro 

65 7tl 75 

Leu Ser Ser Ser Ser Ile Leu Ser Trp Lys Phe Cys His Val Leu 

BO 85 90 

HiC£ Ly£f ViStl li&u Arg Asp (2ly HiiS Pro A£Jl Val LC:U His Asp Tyr 

Gin Arg I'yr Arg Scr Asn lie Arg Clu He Gly Asp beu 'i'rp Gly 

110 115 120 

Hiri Leii Aiu AHp C?lii Tyr Gly Ki« Leu Wnl Asiii II tf TyV TllT Lyfe 

IJi'ci 130 1S5 

Leu Leu Leu Thr Lys Ile Ser Phe His Leu Lys His Pro Gln Phe 

140 145 150 

Pro Ala Gly Leu Glu Val Thr Asp G]u Vol J,«ii Gin LyK Alfj> Ala 

15S 160 165 

Gly VaI ah^ Asn ili^ Ph£i Gin hau. Thr val Clu McL Phe 

Iff) 175 180 



Asp Tyr Met". Asp Cye GIm Leu 7iYP fStfr Glu Sfcr Val Pins Arg 
185 



190 



195 



GIti L&u Asn Thr Ala 11© Ala Val R&i- Qlii Kaf. Ser 5(51- Gly Gin 

200 2D& a1 0 

Cye A.irg Leu Alii Px'^o Leu 11& Gin Val Tie Gin hsp Cys Ser His 

215 220 22.5 

Leu Tyr His Tyr Thr Vr?! T^ys T.eu Met Phe Lys Leu His Ser Cys 

230 235 240 

Leru Ptu Ala Asp Th£ Leu CJln Gly HIk Aa tj Ahj) Artj Ph*5 Hie Gl » 

2d5 250 :i55 

GlTi Phe His Sex* Le» LvF' Ajejti P>ii= P/ns Arg Arg Ala Ser Asp Met 

260 265 270 

Lfeu Tyr Phc Lys Arg Leu He Gin Ho Pro Arg Lgu Pro Glu Gly 



Pro Pro Asn Phe Tj^u Are Aln Ser- AI3 ^'*?v hia GJ.u Hie He Lys 

290 295 300 

Pro Val val val He Pro Glu Glu Ala Pro Clu Glu Glu Clu Pro 

305 310 315 

Glu A&n Lt»u 3lc Clu Ho Ser Ser Ala Pro Pro Ala Uly Clu Pro 

320 125 33 0 

Val val val Ala Asp Leu Pbe Asp Glu Thr Phe G.ly Pro Pro Asn 

335 340 345 

GJy i>«3T- Kut. Ixyn Asp Asp Arg Ai^p L£.>u Qlti lie Glu Aen Leu Lys 

3S0 3fiS 360 

Arg Glu Val G.h) 7hr: Leu Ar g Ala G-l « I^eu Gil u l.yn lie Lys Met. 

36fi 370 37 !> 

Glu Ala ttln Arg "J^r He Ser Gin Lou Lys Gly Gin val Asn Gly 

380 385 390 

Lou Glu Ala Giu Leia Glu Glu G.ln Arg Lys Gin Lys Glxi Lys Ala 

395 400 405 

Lesu Miil Aap Asn c^lu Cln Lou Arg His Glu Lcu Ala Cln Leu Lys 

410 415 420 

Ala Leu Gin Leu Glu G.ly AJtJ Arg Ajpn Gin Gly Leu Arg Glx> Glu 



275 



2B8 



28S 



425 



430 



435 



Ala Glu Arg Lys Ala S«r Ala Thi- Gl\i Ala Arg Tyr Hex- LyB Leu 



440 



^45 



4 50 



Lys GIai LyH HiK Ser Glii L«u lit? A»n Thr Hie Ala Giu ijeu Leu 

dSh 460 4B5 

Arg Lys Asn Ala Acp Thr Ala Lys Gin l.cu l*hx val Thr Gin Gin 

470 475 480 

Ser Gin <3lu Glu V£il Ala fii^Q Val Lyw Gin Gin T.exi A.la Phe Gin 

dSB 4^0 495 

Met Gin GJn Ala I^vb Arg Glu S<3r Glu Met Lys Met Glu Glu Gin 

500 505 510 

Sei" Asp Gin Ijcu Glu Lys Leu liys Arg Glu Leu Ala A1<j Artj Ala 

F>1S b20 525 

Gly Glu Leu Kla Arg AIh Gin G.ln ftia Leu Ser Arg 'I'hr Glu Gin 

530 535 540 

Ser Cly Ser Glu Leu Ser fler hxcf Leu Asp 'i'hr l»cu Asn Ala Glu 

545 550 555 

LyB Glu Ala Lt?a Ser Gly Val Val Ai'iJ Gin ArtJ Gin Al^j G.lu T.eu 

1.60 iiGb 570 

Leu Ala Ala Gln Ser Leu Val Arg Glu Lys Glu Glu Ala Leu Ser 

575 583 5B5 

G:in Glu Gin Gljj AiQ scr Scr Gin Glu Lys Cly Clu Lou Arg Gly 

590 595 600 

Gin Leu Ala Glu Tiy» Glu ir>«r Gin Glu Gin Gly Levi Axti GIti I.ys 

605 610 615 

Leu L*ju AKp Glu Gin LCu Ala Val LG>u ArCf £0€n* Ala Ala Ala Glu 

620 625 <>aO 

Ala Glu Ala lie Lev Gin Abp Ala Val Ser Lys Leu Asp Asp Pro 

635 640 645 

Leu His Leu Arg Cys Thr Ser Ser Pro Asp Tyr Leu Val Ser Arg 

650 655 660 

Ala Gln Ala Ala Leu Asp Ser Val Ser Gly Leu Glu Gln Gly His 

665 670 675 

Thr Gin T/r lieu Al/j Sfex Ear <3lu Aep Altj Shj- a1« Lku V^jI Alrj 

680 68b 690 

Ala Leu Tlir Arg Phe S«r Kih Ltns Alfi Al.a Asp Thi- He Val Asn 
695 



700 



705 



Gly Ala Aia liar Sex His Lea Ala Pro nir Asp Pro Ala Asp Arg 

710 71FJ 72 0 

hexi Mi?t. Akip Thr CyS Arcj Glu C.y& C3ly AI<j Arg Ali=i Leu CSlu L^u 

725 V30 735 

Val Gly Gin Leu G3ti A»p G.ln Tbr- Val Leu Arq Arij Ala Gin Pro 

740 745 750 

Ser Lexi Wi?t Ax'3 Ala Pro I^eu Gin (Sly Jl& Leu Glii Lou Cly Gin 

755 760 765 

Asp Leu hys Pto Ly» Tifiu A»p Val A.Tg filri CSlu fJlii LtfU Gly 

770 775 VeO 

Ala MeL Val Acp Lys Glu Met Ale? Ala Thx fler Ala Aia He Glu 

7B5 790 795 

Asp Ala Vai Ar« Arg lie Glxi Afcp Mtat. ME^e. Hex' Gin Ala Arg His 

800 80& BIQ 

Glu Kbl' Ser Gly Val Lys Leu Glu Val Asn Glu Arej Tie? Leu Asn 

815 820 e2h 

Ser CvK Thr Asp Le-u Met Lyc Ala lie Arg Leu Leu Vet I M«t-- Thr 

8 JO 835 840 

S^r- Tlir scr Leu Gin Lys Giu Tl« Val Glu Ser t3ly Arg Cly Ala 

S45 iJ50 855 

Ala Thr Glu Glii Glu Phe Tyr Ala Jjys Ann Ser Arg Trp Tbr Glu 

860 B65 870 

Gly Le\i He Ser Ala Rrt l.ya Al<a Val Gly Trp Gly Ala Thr Gin 

875 SSR 8B5 

Leu ViJl Glu ser Alfis Asp Lyc Val Val Leu Hijs Met Gly T.ys Tyx- 

890 695 900 

Glu Glu LGU He Val Cys Ser Kis Glu l^t?. Alfj Ala Sfer That Ala 

905 910 915 

Gin Leu V«l Alu Ala Ser Lys Val Lys Ala Asn Lys Asn Ser Prn 

9^a 925 S30 

His Lea Scr Arg Leu Gin Glu Cys Ser Kvij Tbr Vul A^n Glu Arg 



935 



9d5 



Ala Ala Asn Val Val Ala Ser Thr Lys sor Gly Gin Glu Gin He 



$>5D 955 960 

C3lu Asp Arg Asp tHt M£?t Asp Phe Sier Gly Lou sor Loia He Lys 

965 <370 975 

Leu LyjEs T.ys «ln Giix Met Gla Thr Gin Val Arg Vai T^v Glu hau. 

980 98S 99(i 

Glu ijyo Thr T.i*ni C3lu Ala Glu Arg Val Axg Leu Gly Glu Leu Ai-?? 

HOD 1105 

Lye Gin His 'ivr Val T.«u Ala Gly Gly MeL Gly Thr Pro Scr Glu 
1110 1115 1120 

Glu Glu i'TO Sar Arg Pro Ser Pro his Pro Arg Ser Giy Ma Tlir 
n2b 1130 1135 

LyR Lye Pro Pro i^eu Als C-iilrj Lys Pro S&r He Ala Pro Arcf Thr 
1140 lli^j 1150 

Asp Auii Gill LOU Acp Lys Lyp Asp Gly Val Tyx Poi Ali* Gin L£;u 
1155 1160 1165 

Veil Asn 'lyr 



(2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH:
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
GAAGATACCCCACCAAAC 18 

(2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GCTTC3 ACACr tJTAGTCATAA AGGTGaCTOC AGTX:C 35 

(2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
OGACATCTCf.: AGCtiyAGCrOA ATAC 24 
(2) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: yes
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

CUACUACUAC UACUAGGCCA CGCGTCOACT ACn'ACICTaGlI GGGflGGCn G 41 

(2) INFORMATION FOR SEQ ID NO: 16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 516
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 1 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

TCTGTOGAAG UTl'lifiCiAiUytJ GAGAGAOCXiO CAGC'J<;sGATI5 CiVriHiUUCC ACUUlXiGOl.^U 60 

C'i<jA'K!TCT(: CfiCCTCTTCrC TCCTGCrCfTG GGAfiAWiTTlA TRTTTCCCTR C^GGGATGAAA 120 

RrATCTTTTT GTGCGGGtTr TAATIXSCCAT UTTCSTJiGTGC: CAAaGGAG'Xila AG'nUGCtJtSOU IBO 

GGACCWiCAG CTOGCCACAr! tlCJ^nTCCCttii riCAnTCfSTrif! CCACTCCrTC ftr.:C!ACCO0C?ft 5!4n 

(XX?A(5C^GC TCCTGGGABC GCTGCCCACC TCTGCCCCCA GCTOBBCGCC TGCAAGGAAC 3 DO 

CGACCACL'CG 'i'UUyGCJ'OOG CJr.iACCrrOCsC TiiGAiGsGAf5i5fl OAhAGCuitiCXi OaiJ'lX;'J.<>'OCiA JSO 

GGCSTCTCaRC CJiCTCTCAGA GGCTTATPCA TCTCATCCTC CTTTCCCTCC CCCTTCTTOT 420 

ITTTCAGACr GrcAUCA'X'CA ATiHAOC^CCAl' TAA'J'AUOCAG (.'AAGTCGO-JYI^ TAAAOOAAAA 460 

ACftCSOCC!AfiA AATATOCTTT GGATGTTJGCT TGGAAG SI 6 



(2) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 193
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 2 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TSTTTTCCAT AACOCCOCCT CACCCSTCJfiAT ArTOGGC^iCC CACCATGAGA AAGGGGCACA 60 
GiACCTTCTGC TTTOTTGTCfl ACCGOCTGCC Tf^T^TCTAQi: AkK-OKMO'lXiC HOtXK'fKiC}^^ 1^*11 
GTTCTGCCAT GniTrUCACA AfeCTCCfTnrn ARATGGACAC: CCGAflCGTGA GTTCCTGGGG 160 
CTATfiGCCTVf «PA 15>.* 

(2) INFORMATION FOR SEQ ID NO: 18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 104
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 3 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

GTBtlflT'lli' GCCCCTeOAG CfCCTViAJiGn ACTCTCTGAG ATACAGRAAT GAATTIiAGTG CO 
AC!ATGAGCAG GATGTGGGTG AGO^TrGGAUA ■iX;^rA(;'l'CAUG fitiCC 104 

(2) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 327
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 4 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

ARTTCLTUGC TOCKCTiTCVr. TTnACTGITA TGTTCTTT.n' flTTCArTTflTC: TTTCCr.CT>C:C bn 
•J-CTTCf^TATLA AGGGCCACCT UAUCGAB^iWy 'J-AltSGCCACC TGTGCAGCAT CTACCTCAAR 120 
CTBCTAftUAA CCAACA'K5f::h CJTACCACTlCf.- /LTIAGTRACTO TClTXICOCK:* OfVTCTCCOT'.C ^ UU 

Chccficrnncc tcccctoctc cA-nuyt-j-Js:* auccc-rcocT bggctcattt gtcagctctt 240 

TCAGBT.WITA AOCCn^r! nCTTmTJWSfJ AARTCTWHC: ATt'A'^CSTACJC ChAaOTCfTCSA .^mi 
liAGACKiAAJia CCACCGCCAG GCCCAOG 
(2) INFORMATION FOR SEQ ID NO: 21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 331
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 5 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

GGGC'iXJAftOC ^^Tr:l:!TfxcA cctcgscctc ccaagtagct gggaccacag gcgtu'itgcca &o 

rrACGCCCGS CTGKSAKAGG GO'J<rn-CATlS TCTTCOttCCC: nJAnTV^aCTT (TfCTf^TTrrTC 

cco'jitCftGAh Tc:r:cwr;TTc ccacsgcaacc tgcagatgag tgaccgccag ctgbacgawu leo 

CTCGATtATUW; TGRCDTGAAC AAC3T«TAAG ■IXiGCJ-JXX'J^C CClXiAiCJCOCfl (iCKSArjGGAOfr 
AAGCTTTTCT GAA'IHSCTGAC ACTTCTCATO ACCCTCfcTOC! ACCCrCTfiAT OCJGGGGWJGr 3 0D 
C«33«3C^0GSGS ATCr^CCACrA ?iW5Crc:CTGG G 



(2) INFORMATION FOR SEQ ID NO: 22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 470
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 6 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

MTTOTCECTG TCACT«TTt3A CTTCACCAlUa CTtSCATKGCC ATAftMCCCA CAAGGOTAAG 60 
ALTl'yGiAGC'r liG^iGSTTCTCT CJTPOTClTTTCiC CCATCCaCAT CfACSCATTtxSA iSACJiSGAerA L20 
fiCJTfTAGAGCG TGGBSGAGGG GRCAGGTAAC AGACCGGCCT CAGECTGTOG AG-^GTAAGCT lUU 

CTCTTTCu'lx; 'I'JxSCstsi'iXrtC; ■n"rr^CAG'i-j;A JwJAG-j'iSL'sArjA •j<is'jt'1'C!ac-j-a Cc-i^uuiAWTur 240 

CiAftCiTCA/iCC ICTTCCAAAC AGGTGAGTCT CTTCCCTCCC C?rrTAACCfA GGCTCTflATG 

GEAACrACCr AKl'UCCTAti-y C;Clt;C'l<;'lCC OjiaOflAAGlf 'jriCAUIJAd^AA GCiWJJ'AGUAA 3 CO 

flATCCARArA TTCACACCCC ATCTCTGG7C TCTCCAACCC TCGTCCASOG AGGGAf'TGAA d2U 

CCTCTTCAG'r A'JTnWl-J'J" TTAAGAfiAC^A A<3ai1'C"l'CG<S0 tUSeOHSCiWST 470 

(2) INFORMATION FOR SEQ ID NO: 23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 565
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 7 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

TCTTCACC'JV "J'Jl'AA'ICGGO ATAC:nTTTAr' fTTATCTCATG GGTvSTGTTGT GAAGGTTAMi SD 
TOA ATTAR AT GAGGTAAAGC ACGCACAGAA TlXJUTCCnx; 0'JVJ'A'n5'i':x5 iSjUUCOO'J^^!:: 1/!0 
TCl'GCCt.'t'JV 'J'tlAAOACCCC OCiCTTTAATYT fif.'CTGGCTCT ACTCACCTTTC' TCCCTCACTT 160 
TTATTOTfTPA STATTCAACT CCCTGGACAT tJ'J'CCCGCUlL'l' <*J\srCUCjl^ CiGCCAlCCAOt! /!4 0 
GCAGOIGCWiC CTCGCCCCOC 'iVjA'JOCCftlGSCJT f:RTrrr«GAr IV.TJAfiCCAfX? TTTATGACTA iOO 
OACTOTNSiftC CTTTTrn^rrA AACTCCAOTC CTGTGAGTAC CGCSGGCCA^ AlTCTiVl-JAC i&O 
ATCAGATTCA EGCCAGAGGG AGGAlXXXJAtS CUTOflOGA'Kf •J<C:C:Ci:;AC;AC* aArCfTAjfiTf-T: 
TTCTCAt? JXiC CrrrGCCT&r CTCCTTXTTCT TrCAAAAGGr CCCtSGAGCTT CTGACCATTC 4£J0 
TCACtvATAAA AGAGCAGGGr CCAGGCTTTS GTUACCCCAG ■rAAAGCCC'CJ" GGC'l'iWCAC 540 
TCOTOCGTCC AGTO'l'l'ACAy GAl^CT hfiS 



(2) INFORMATION FOR SEQ ID NO: 24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 233
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 8 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

GCJGACACCTC TMiCCChCTC CTf ICiCCCCnC GT.fcflTCiC'JXp'Cs UCfl^-^ATyCW liflGGGTAC^Cri' &0 

GGGCCCCTCC CCCTCGWlAtS CL'CCUCTGTTG OCTTCCCTOC CCTCTGGTCC CCCTCCCCTC 120 

TCACAC'jCTf Tf:c'.AAT«nsc:T Tc:rjif:c:tic:'Pc ccftf.:C:niSJ%CA occ-i^JCfliflOtf ei;flCO<XKsAL: leo 

CGCTTCATGG AGCAGTLTJiC AAAUTWlGTO GTTCAAGTAh CAGGWlTGGA GGT 3ii 



(2) INFORMATION FOR SEQ ID NO: 25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 578
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exons 9 and 10 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

TCAATCCCAfi OACCAltiGAO TTTRTClTCtiT TV;tiCfcr:C:C!'rX: 'KiWTT^Xl CTWCnaflCIf: 6 CI 
GGCACCA/IAUT! fJOVGGTGGCT GCTCTGTCCC CTACATCGGG CTGATGARGA CAiCCCAGCAC 120 
CCCTCAGGTC UTTC'H^UAiUC CCPHGiiTltiA AAl^TC'lttTJ' CTACCGCTtC AOCAWCTr.f: IflC 

ACri-AC?nCAa gcggctcatt* cagatccccc agctgcctga ggtaagcatg cccaaccaca 240 

CACCCrCGGC AC-iXit;A«A(5fe tCCCAGGTAC 'J^TPTl'A^yXi tCiOCiOWrsCiC ccTc:f:c:AArr •/nn> 

MiCjeAC?rA'PT TGAfirTATWO TCTCCGTCTT CAGAACCCAC CCAACTTCCT GCGAUCCTVA 360 

<5CCCTI5TCAG AACATATCAU CCClti-KSC^J'U <5TI5A-1<;0C-J«! CaOAOGCCK; ATCCiCCCCAT 420 

ASCGAGCUAG ICCTAGJiOaa C:r.ATrArfTPf: ATGCACATrKJ ATGCCTCTCA CCAGGTGAGG 480 

ACCArTTCrr AGAGAMCTT GGCCTTTCCT CTCACCTGCA AGTACAISGGG Al5AUOC-l<i«50 540 

(5GAGACCCTG GCCAAAIiCCC ArX^iACJiCM MlOACCtT 



(2) INFORMATION FOR SEQ ID NO: 26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 390
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 11 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

ARAAARATTT AAAAAA'JU'AA AeAlJGlCT<:A ^C:f:c:TTTA^'P TCGARAAAfJC GGGCATTCTC 60 
UCATATCaCT CftACICACrr: ACACACAGAA TTCTCTGGCT CTCTX3ACTTA I'jX.-J'CACTCU 1.20 
TrrrraRTCA ACCACAGAAT TTAITHUACA ACMUVriXiA •rGACA-J'CTTr COCfeCTTC:AT IHU 
TCAGCAGTGA TX:CtrrJ'CAA*J' •nXJAACAlGflTC! aAAATCCTVrP rAATAACiRAT GAGAAGTGAG 200 
TCCAAGCltiC: GTTCAACCAC: ^TCCnTAGG AGCTAAGTTA AGCCATCGIX: -XGCCTCAAAA 300 
CACT^iACCAA AGAOGAATTC TTAATGATAC TGGGUCl'JCT 'I'A^i'AJACAGA ACATCTTCAA :ifta 
GGGTTGGGGG CAATOUCTrA •iXXXfKfViiM 






(2) INFORMATION FOR SEQ ID NO: 27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 547
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 12 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

A?l?lATCMTA ACCATQGATT VA-iXil?i(iVArX Af^TVAHrfir CTGGTAftlCAT TTAGftGTATA 60 
ATnWJ^eCA TTTC^Jiftr^AA TTRTCCCrWl ATTAArATCrA f:C:TTTT'AA'rT TCCTCPCiCTt^; J.2P 
AGrTTACAAr TAAAAACAGA GGGATAtJAMi CftCTAlltJAAR GCAAACTCAT TCCCCTTCTC ISD 
Tra>CJH5GGA CCftCTTAATT RAGCOArTAT ACAC?AlRAit?AT rAfrTTCATTri AAfXtClACAOO 240 
TAGAAAACAT GAAGACTC^IU Gl'ATAAlCrJXi UA'lX'HUCliCr tSCCTTTCSCGC n\7ACCAAAA 20D 
CAiOO^TAGAI- T7«tkTC!TT-A AATTTCCATt: ACAri?A(5f:frA finCArjfMTKlf:! CKSfcC-aCOTO 360 
TAATCCTAGC TkCTPTTGGGAG GCCAA^<30AG GAGGATTACC TGAGG7CGGG AGTTCGAGAC 420 
CAGCCTGGGC ARUAyUOlXiA AftCCCCCSOTC: TTCA^TAAAa ATGCAA'J-AAT 'V/oLiCOOQCi-K.^ 460 
'IVn'OGCAGO CSACrCITCrAAT CCCTAiSCTACT CGGGAAiG[?TG AGGCA'PGAGA ATTCCTTGAA MO 
CTTGGGA f;47 



(2) INFORMATION FOR SEQ ID NO: 28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 436
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 13 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

CCCCCAGCCA CTCTAAAGAG GACCACAft'JT UCCCGGCCAT Cfl'l'CCCO'J'GfJ' 'I'AU'iXiTrGl"^' 60 
GATTGAGGGG CTCUTAATGA CCAGA-rGGiTC! CAACCCTXi^CJT riCCACCTV^CA RA(V?T«AC:TT 1:^0 
AGGGGAATCA GGTATTTACT" TGGAA«PA1<G GTAGGAC0CX3 CTTCTCCGGC CC AOGCCCGT laO 
GACCCGTGOr AGTGGGCGGT TGGCCTCATG ACCGGAGTOC CCCCACAGAG CCAiCXXSGGT"!' 240 
GTGCTGCAGC TGAAGGGCCA CCi-J^^AGPGAO CTGCAJwGSCW;; ATCTTCCCX^A GCACffrAGCAr :iOt) 
CITJCGGCAGC ACrjCCOCCnA rCTACTSTGAA TTCCTCC5GG CAGAACTOGA CGAJSCTCAGG 360 
AfiGCAGCTSGG AGGACACCGA (SAAGGCTUAG COGAlGCCJ'rG'l' CKSAOATWA AAOffOAriCiCC 420 

(2) INFORMATION FOR SEQ ID NO: 29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 469
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 14 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

GACTTCABOC ChAGGAGU'l-C AAGGCItiClWS ITJAACAOTVSA TTQTCCCWT CCACPCCIACC fet> 
CrcCCTTvAC/i GAGCAAGACT GTCTCAAAAC JiAAACAAGEA GGACCTOHTTA GGGRCCCTGG 120 
CTCATTiaCAA ttliAAGGOAAHi! GGTCCCfOCrj- AiJUTTSiCfrCT CICl-PCAOCSTTCI CTCr^TTTAO T8« 
ATACAWrafVUi AC3CTCAAGCC ARTGAACAGC GATATAlBCAA GCTAAASGAG AAGTACAGCG 2J0 
ASCTOC?rTQR GAACCAC(.Vr ISACCKJCJ^GC iSrjAAGGfrAAe ACCCrcAfiJCC; CCTOTOAOCA ^00 
TCCJ<5CA€tOC! C!1::TC.CAiCCT«^ TAGGGAf3Af:A GCRRCTPTAGCJ CCTGrTHSCTT CCCCGGGGCC 360 
ACSCAArCCCT ACATTCATCT CTAAGGCATT GCCGTCATCT C'GGGAALL'AC ACCTmrAHi A2'^ 
GCTTCCTTyC Cl'CTO'i\?J'Crj' I'GOGCnVCSl'G'r CCi^GOerCCC: M'?CCCJ^T{: 



(2) INFORMATION FOR SEQ ID NO: 30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 359
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 15 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

riCGTAGGAAA GTGATTCCTB ■re-TC-JXiAClTC; 'rACSCjeCACOC! ACA<rcrCTCAG TATGATTGTC 60 
CTAGAAGGAG GA-i<?J'OCrCT AACOCTCCC^ TCT^OTGGTT CAAGACACTG TTCTTC'TTrj- 120 
GCAGftATOCA CWTfiArr-A AACAGGTGTC CATOGCL'AGA CAAtiOCCAGG 'J'A<ykTTTCCA lfW> 
AOTAGAGAAA AAAGAQCTGG AGGAlTCGrr (JGAGCGCATC ACTfTACrrJifiG GCCAGCGGAA 2iO 
GGTGAG1'G<JC ACGAGsGAGCA CKTOiy^AAAT fykfiGGAGGGG GCTGTTGAUT TiSUrrUGCGiiO 300 
GGCTTTOTGG CCTPCTGCTC CATGGBCAGT TCreiGGGlX; GGrrj^GsGCaTC ACAf^ARCAG 3!s9 



(2) INFORMATION FOR SEQ ID NO:31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 209
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 16 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

GTTGAl'CGCT Trf!<VLrGTTT TfPACATTTTT ATATOCTTTG TCACTGTCAC CCJiSJiTCAGA 
rjTCCCTCTGT ITTTCTTUrc ■i'J-J'CA^CTC AAfeAflOAyU'r UGAAyri\;-]-A C5A«A<Sent5A 
AGCAGtSAACT TCSCSCAf-'AAGC CARCeCGAGC TTCASCTTOT GCAAGGCAlRf: C7GCSAAACTT 
CTCCCCAGGT AAATACCTCC 'J-JNTJ'TT'IT 
(2) INFORMATION FOR SEQ ID NO:32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 485
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 17 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

CCCCt'ACJ^JC ;^ATCAGT(yPG TCCCCtSGGAtJ G«AATCAC5AG "iriGUAftCiTJA AhlCsACiCinATir 60 
KCCTKC^C^HG TCCTTCSCAAC UCGGTCGHW: <rVTCX\HC.r.Tr. TVlfiGAAGT'AG GGACTGTTTA 120 
ACTCAACUAts COTCTCCCTC: TTTCCTTOK5 GTCAL'CTTTG CAGlXJAtSAAU CAftACTCICfiO 180 
AeCCSOArJTTC GCCGAGCTACJ ftUAACXfJiOCSTt r;f:ArACCnY; fiTHAGTOSCG CAGCTCATAG 2^0 
GGAGGACfGAA •iTf'^'KTK^C.TC. TTCGGAABGA ACTGCAGGAC AL'TCAWCTCfl AAriTfJCJOCAf: liOD 
CACACACnRT CACGGACATG GACAOSAGW ACCArCTfT^O AATTTfTV^Cr GAGGGCCTCT 3G0 
i::CGrATGCAC GtJACJOCfGGG ^iCCAfXTCCGG GGCTGCTGAG AAGGGQTTTG GUGCCU'KjOC 
CTBATTGltSC AC^CATTCTG TAGGTOTAAT GCCAUfAtitSU: L'L'HUCATTGC CTCCACATr'K: 4BD 
(lATCA ^Bfi 

(2) INFORMATION FOR SEQ ID NO:33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 468
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 18 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

TTACIWGCTT GBACCTCAT'J' tfGCICATOACT T>RAGrTAAKA TGCTAAGAGC CCCAGCCAGB 6D 
TCATCCTGtrr OftGClTCATT ATGGAGITCTA GOGCAUACTC ICACCTCCCl' GGACC^'m•T l.JI) 



TAGAATCTAr G'i<jCCArJC'l-l" OCSCAAAnACC AACC^^hRaT RCMTTTCGTO GfiGTCCThOGn 180 

A©C4C5T(3CJnGA GCMGTGATA CAAGAL'GL'CC TGAftCCABCT ■IVAABAAUCJ- UCIVJVATCA 2^0 

GCTGCGCTGG O-J^r^TCCAOCiT ACfcrTTTfiCAA TTYTCflf.'IinCT G<5rAGGGGCC AGGTCCTTAC 300 

AlSCCn'nAGAC TCTGTTGATC TPGAATCTCA TCTGBGR[;n' ACiCl'CAftiOe IJTC-JOJWSCCC 360 

AGCABCM'Gl' CAttCATTACC TTAGGGGCCC rrftflSCCCCA TCCTAGATCA GTTACATGTG ^20 
GfflAJlCTCTGT GCATTAGTOC CTATACACTA G-JArrTTAGl' A"J'n"l\J':*J' 



(2) INFORMATION FOR SEQ ID NO:34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 393
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 19 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

CACTAGTAAtS Cl-CCTCCJa'TT CAGTOCTTAA TTAACGAUISA TGAAOCCACi:: TATY^ARATiCT 60 
•j<iCTCT?<5AC7C TTCCCCTGTG TJ'OOCri'CTCa C:Ar.AirACCrr CCTCTCCACG GTCAUA-rcCA 120 
TrrOCACCTG CA-JVXlACiCAA TTCJCSAGAAAA GCTXJtJAWUCA I^TA-.iCTrjrJflf: TV^rrfJOlGAAG 16{> 
<^rAW5WlTGG CCAAGGACAG TCrOTGl-Cf^G CTACTVyiTGG CCAGACAGGG TTCAGAAGCA 24 0 

CCTGAATGCG GOtiATAtil>iA CRf:C:TCCCTC TGCATCAAGA AAQGCA'ICr^ OTICjAACTCAT iOD 
ACAAGAAACO CATHTAGGCA ACTCATAAAA 0O(iGM!GAGA COrrTATGAAA OTGTCACCAT iGO 
CAACCAGACr TGAGAAAiCTT CTC'iriTCCAA TTC 



(2) INFORMATION FOR SEQ ID NO:35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 421
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 20 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

G^CCTGCCCA GAAGGTAAGA AJXSGCCaAGG ACAKTCTCTC TUGGCJ-A^yPG ATCRrCARAC 60 
AQBOT-rCAGA AfeCACCTC^A TSCGGGGATA UimACAGsGflCC: CrTrT<9rATC AAGAAAGGCA 120 
'i^TARCCAAC TCATACAAGA AAGGCAliGTA nCCAATTCAT AAAACQGUAG (iA<iA«€S<?TaT 1H(I 
CAAAGTCTCA CCATCAACCA {SftCCTCAGAA ACTTCTCTTT CCAA'JCCTGC CATJirATCAG 2iO 
TGGACnX:i<; CATTCOATAA CCCTGCTCGC CCACTTGACC ftCfOCAHRCCA OTCCTL'ATCG 300 
TtSCCACCACC TGCCTCTAGAG CCCt:A<.-CJ<3A CCCTOCCGAT OOTGAGTACT O&GGCATGAC 350 
GCiGCTCTICA TGGACCAGGU GACC?»<0(XSaG CCTTTAAAAU TinClt;TTW GCCGGGCGCA 42D 
G 
(2) INFORMATION FOR SEQ ID NO:36:
(i) SEQUENCE CHARACTERISTICS:

(A) LENOTHr 49S 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 21 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:



Af3GOC5GAfiC3C AGGAGAATCU CTTGA^CT^fc fiRARCOKfiJlR TOTQMGTGA GCCGAGATOG 60 

CGCCACTGL'fl U'lCCACCICTG GGCAACAJWSR GCGJtfSRCTCC ATUTUAAAAA AAAftCWTT^T 120 

AnsiCCITTRT ATCTCCAGCA U'lX;ACCCyi€!r: QCf<^t>.7Kr.C\ f?rATGaCA03 (3AAACCCTCG IDO 

t:OTACCTGGC trJXXCTCriAR GAAGAQBGAA GCCTT5AGAA l^t;t;GACAGC ACACCCAIK.'A JiQ 

GtSAACTOCCT G7W3CAAGATC AAGGOCAlTlC: CCCSftfTinrrtrT TnGAnrTuOTA TCATKSAGGA 300 

GCaTTGTTAT TCrrTCTClCirT RTOOSTECTG GTGBATGGCC AUGGAA'J^'U IfA-JXl^PTPTf: :?(iC 

AGCl'ACST'JCT TTCTGCACTT ALJAACTTGAT TCTACiAAACh r:ATTr;TT?A7iA ATTGSAAAAT ■920 

CTCOTCTSGOT GCAGUTGA'i-J-P ATCCCJTfiTAA TCCCAGCACT TTGGGAOSL'C GflUfJ'CAeGAtS 4410 

GATCACnXiA GGCr^KyAfr i^b 



(2) INFORMATION FOR SEQ ID NO:37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 427
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 22 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:



CCC-XV3TGCSCT TGCAGAAGGT OTrfGCreGG TVyrJCCTrriG CCTTGCCA'J-U TltJ'IA^CSOrJT 60 

TACAGATDGC AGAGCiAOAAG hfSArJiGGAGG CCCCAAGGTC AOl-^XlAOCCT TTGTGATGTe 120 

^TTCACAO&Ae CTCCTGCCCA GGGGACTX^GA CATCAACCTM:: GAGGAGCO-GG UGUAiLX:;i'GG'r IHQ 

GCiftCSAAGOnR ATGGCGGCCA CTTrjAGCT^CC TATTGAAACT GCCACGCiCCA CAATftraGGT 240 

A(5G*C5GTTCC 'TCSCACiGATCT CCTGAAACGA TGCCTTl^CA GC'IXiCCCTTC TGCAACACTG 300 

CTCATTAAAC AlVSTCfrCAflT CGTTCATTAA GGCCA'J^OGCA ACCf^CTTAAG ACAt5AAACCA 

GAATTTGCCA GGCACAGTGtJ ClliAlCCCTC! TAACCCmGC ACCTTGGGAG ClATCJuCTTGA d20 

GTCCAGG 427 



(2) INFORMATION FOR SEQ ID NO:38:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 367




(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 23 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

CCCCCTGAAT AC01^A^??ii(7r rTGCATTCTr TTCTGftCTCT CTCAAtSAATtJ IXiiyeCAGCJ^A 60 
Cj'KSOSCSfiRnT TCCAGATTCA UOTlTCCCjajG CT^CCACJKT: ATRTTrXv^iCT RTJiAGTATAQ 120 
TAftBACATTA GrC!CA1<r?CTT AATATOCAAG BCACATTTAG ARACCA'IX^L'T ■J\rrri'ri\;At; 380 
AWWJftTGCP CAGCAAATCC CGACfCAOOAC ACIAt^AfinHifrr f.'/iAATTO«AR RTfiAATRAAA 2i0 
GSTCGBTCltJ A<JC<5C!CJ».Tr:r TGGGACCTAC GGGAC3CAGGR TCl'GTCTTCC TCACAT-JX^T 300 
CTATACTTTC TATACTTATT AXJUtfAA'n'AQ AlGC!AGa«3CAG WCCAC:CC:ac OC!C!C;aJWX:CX 360 
Tr^AGTTO 



(2) INFORMATION FOR SEQ ID NO:39:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 502

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 24 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

OCCDGCAGAA 'IXJTTCCAGrA ACCTVTAGCAC CCTlXJTTM^f^ TCCCTTTCCC AUTTCLAAUCJ' W 
•IV^CCTTTGGC TAGGA<5TGl5y (5A*f:AKAAC?C GTCGTCTTCA TTGA'KJ'i'jVSG A'^CyTTfiATCT 120 
CAETGTATCC ■iWACTTCTr TGTOTXSGCAy GA'lCCOTOC?! TClCTnTACCA GCCTCATX3CA IBO 
AGCtATTCAC (WGCTCATCG "IXJGCCTCTAA RRAjCTCTCCAG AGAGAGATTG ■I'OGAGAGCGC 2At} 
CAGCGTGAGC OTBGGTGltSG GCmraGGCA GGAAG3!«GAG GCAlOGGl'GA CACACTCCCB 300 
CTCCAACGGA CTCTGTCArR CTOCL-GTCIT AO'J^J'fGTeTi:! TCCATrTPGAG TACAGAGCAG 350 
CCAC-lCCTC?!' ASATATCAGC AUAGGCCCTC Cf^^JbCiAAGTC AGAGCTCL'AG GACClCCCCA 420 
GAGOGMGCC A6GCA'lXi'I<;T CCCAACTCCA GCTCCCTTGG C'AlCAGSGCaCiA CATTCTTOSA 180 
tiCVFGCry&fQ GCAGCCCTOT TT SOX 



(2) INFORMATION FOR SEQ ID NO:40:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 437
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no






(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 25 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

TTTTSGTCTC TGAATCTTCT IVriTTTTGT WlAATUGGAA TACTAATXSCT rA'JVrC'l'CAG 

AGTTAUTATO AGGATGATTT COKWAATAT ^^'Ky^7>,TA^^,i> OCIftdCTf^CfJA TATJiCTACAT 120 

CICTCAATAAA AGGTOGCTAT TACTATTTOT TATTTCCCTA GGGTJLCAGCA IXXlCCTflAAG l&O 

AGTTlTAlliC CAhGSMCITCIT COATRGArAO AJuCJCiACTTA'P <?TC-'AnrrTf:C: AAiSaCTfiTGO 2^<i 

GCTGGGGAGC CACTIiGTCATG GTreiAAGTAT CTATTCCTAC CAAGUGTCUT COCATGACCC 300 

CTCTTCCA'lT OATCCACTCC AAACTAATAifJCf TAAflCAnCCA )\AAAAAAATC: TOTCCCTTAG 3(50 

AAATAAACTA TOGATCAGGA AGTCAArAGG ACCGAGPITA CAAGGGAGCC TCSGCTCTCCC 420 

AGGGGACACA i^CAGO ^'-^'f 



(2) INFORMATION FOR SEQ ID NO:41:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 351
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 26 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:



GOGSftCSC(rPGK CTCTCCCAJ3G GGACACAGUC Cf450nM}ti(yr- CCCCTCCCTC TTTAGCCAAW 60 

GGCGATGGBG •l'Ga'A<rlX5GArj GTCWCATTGT GGAGGAGTTG CAUCi'CAT'rT CCJCnCTAACf? 120 

TAii'rCCCTCT' TCTCTfTPTTTC CATCAGGGAT GCAGOTSATC TTOTWHrACA AGGCAGAGOG IBO 
AAATTTGRGG AGCTAATGGT G'JXJflTC'fCAr CTAARTTGCTG CTAGCACAGC CllftGCl'KSTG 

GCTGCATCCA AGtS-lAeGACC TOCC^GOACC TCPTAGGACG CJiU<iAAGtiCC 'KiCTTAKAa;! 300 

GTACTAGGCT ACCTTAAAGA GTACTTGGCT GOO-l'TAliliCA C'^^f^TTGGC:T G 351 



(2) INFORMATION FOR SEQ ID NO:42:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 418
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 27 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

CTTTTTATAT GATAGATATG TL'AGGAieCTG ACTATAnUCA GCAGATTTTG AGftAG<J'lWL'r 60 
TGGTGATTOC CGrri<iGOC<C ACATATCTTTT GCTAAGAACC ATCAUAGCAk TTATCTCftTT 1^0 
CAGTOOTTGl' TGCTCTAOCST GTTGTATBAA CC'^AAA^^^^ CTTTGTCCTG GTAGGTOAAA 160 



GCTGATAAfiC ACAf:CCCCAA CCTAGCCCAG CTXJCAGL'AGG CCTCTOGGGG AGTGAAL'CftfcS 2*Q 

RCCACTGCCG GOGTTC'iXiUC CTCfAACCATT TCCtf3rJf:AAAT> OACArATC^nA AfJAGATAfSflT 300 

AGccrri<;c;^ Ah^nGACcCT tttcttaccc ALxxrrG'mGA <iCTcrTCTCT gcatcctjsju 3 fin 

CTGTGATCCC TSAUCAAA'iW C^ACAGGACTO TCTC'TAAATT fJTTTCATATT TTTCATCT IIS 



(2) INFORMATION FOR SEQ ID NO:43:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 279
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 28 of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

TPTCCACAGA GCATIXXCAT TGGCTGOCTU TCAGC^lVSCClft OTChVCClirjn flTMRTkftTTTC CO 
Al'«A<5iflC!C!'rT CTTOTTTfCA, 'jWirKiCIftCtA CWkCATBGAC TGftaiCTlSAC; 1^(1 

aCRGATCAAA CUOCAAfJACA IGGRTTCTCA GGTJ'ACsGUlt; yj'AGAGCT^C! ^bA^^'TGJlATf^ 180 
GCAUAACGAC COTCAAAAAC ■IX^GAdACCT TTOGWiAAAG CACTACGRGC TTSCTGiyiXSV 2<l<> 
TRCTGAGGGC I^GGGAAGWu^ iSTAAGCTGAC TCAAAGGA'1' 2.19 



(2) INFORMATION FOR SEQ ID NO:44:
(i) SEQUENCE CHARACTERISTICS:
(A} LENGTH; 37)5 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: human
(x) FEATURE: exon 29 and partial cds of HIP1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

aaCATAAAl-l- ATCATTGTCT rPTTAOGflAlCA GACOCATCTC tTACCTACACT GCAAtfAflG-fG eiU 
GIAACCGAAA AAGAATAGAG CCftJiaCCAftC ACCCCATATG TCAGltSTAAA TCCTIV7PTAC L20 
CrATCTCGTtt ■1^GT^JTT^^TTT CCCCAGCCAC AOCJCCAAATC CTTGRATTrcr CAGGOBCAGC IBO 
CAOACCACTG CCATTAODCR GltSCCGA<?«A CATGCATGAC ACTTCCCAAA OACYCT^OTCC. '2A0 
ATAGCGACAC CCTITCTCTT TGGACCCATG <5TCAl"C'i<;'J'C» TTCTTTTrrCC GCCTCCCTAG 300 
•WAlSCATCSaA GRCTGGCCAG TXJCl<;OCCA'P CARCAAGCCT AGGT-ACSAAG AtiOGGTrjCTC 
GSCSQ!(3CAGEG CCALTCAACA CAGAGGACCA ACATCCAiGTC Cl-GCOxSACTA 1TTCJACCCCC 430 
ACAACAATGG GTATCCITTAA TAGAOGAGCT GCTTICTTCiTT TnTTCJlCAGC TPGGAflAGG<i 44>0 
ARGRTCTTAT GrCTTTTCTT TJ'eTGTTTTC TTrTTASTCP TTOCAGTriX.' A'l'aATT^CC^ SdO 
CAAACTTGTG AGCATCAGAlt OCiCTGATGGA ITCCAAACCA IS^Ai^ACThCO CTGAGA'TCTG 6DD 
CACAGTCAGA AGGACGGCAG GAGTGTCCTC GCTCn-CIAATn C^JiAAGCCAT TCTCCOCejC 660 
TTTGGCiCACT fiCCArGGATJ- TCCACTGCTIT CTTATGGTGG TTCGrrtSOtST TVTVKil^T- 'i20 
TGTmmT TTTTAAGTTT C/kCTCACATA <iCCAACT<^ CJCiAAACOCiC^A CACCOCTGGG 780 
GCTGJWyrDTC CAGGGCCCCC CAAC'l'Gfl'GGT AGCTCCAOCG ATGGTOCTGC OCAOPXCTOT 060 



GGTCAGGCGU

TCATCACA'fO 

GAGTGftTDCC 

GAAGAJlCCC'l' 

CATCTTTClXi 
GTTACTGACT 
AGAAAGCiACA 
CCTGTGGACA 
CAGCAO'JTTT 

GTCCATl'Tlt^ 
AATOTTGAA'l' 

OG'i^GCTTr.f: 

AOOCTGATtOT 
GAAACCCOCA 

CCAGTGAGCT 

ATT^'.TCTGCC 

TCAGCTAG'J-J' 
TCAGCCCTCA 
AAAAAAAAGA 

AAAATTATVn? 
CGACTTAGCC: 
GTC'1>CACATC: 
AGAATOGCTA 
TCAftGATGAA 
C?rAMARTGA 

ACAA(3AAGTT 
CTCTAGAGAA 

gtttaatccc 

GACTAGCCTG 
CATGGTCOCA 



TCTCCQCCTC 
AGC1>0CT<31AG 
AARCTGIAATC 
GGGACAAGAG 
OOTO^J^TGRA 
ATOGTCTCTG 

OOTACGCAAT 
ATACTTGCTT 
AAAGAGATGC 
CCCOGCTACA 
TfiGATCCCAA 
GAGCCAGCX^G 
GGA'ltJAGCAC 
CCTCTTCTAG 
CCTTCTrrAT 
CZ^A^^^AAl-A 
TGAACCAACC 
TAJVGSTOGGAC 
ACCTCCATAC 
OCT0C!TTTCT 
CTBTCCCAGT 
AAAG'rtGCTA 
AftnCGGTGGC 
GGTAGGATCA 

ACTCAieGAGG 
GAGA'J^AC«:tC: 
AAAAAAATTA 
CAK3CU15GGCA 
ATTOCTTGAG 
TAAAATlX^AA 

<^c:r;AncTAA 

TTGTGCCAGT 

TTCndTGTCA 
ACTCCAGCCT 
GAATGATCCT 
ItSAOTOATAA 
AGGAAGHI'AGC 

ccc:OA'PTCiTC^ 

GAjKTTATTTA 

AA/iTOEGTCA 
AAAAAACAC^C 
AAGATGTTTC 
A<5CACTTTGG 
GCCAACAlt:iC. 
GGCCCC^ATA 



CACACTCACC 

CCGGCGGAAA 
tiCi'iOCACACA 
GCTOC?PCSAGA 
'iX^'AACiOGLSGC 

TGACAAAUCC 
TT^CJTACCCTT 
TTGGAGCAAT 

-rrrAnccTTCA 

AGCAAGGAGA 

OTCCAAC'l'CC 

ATGGACCCCA 
CT-ATACCAAC 
TTGCT-AGCTT 
CCCA'JM^'fOCC 
ATTCAAAAAA 

ix.xxs'j'rAAGrJ 

GOTT?A<5TATT 

CSraCTCACTT 
CGGGCACAGT 
UCTGAGGTCA 
AAATACCAAA 

CATTGTACTC 
CAAA'JX;<^C 
CAT.TWfirfiCA 
CTCAGGAGTT 
AAAC:A?iAATT 
GGTGGGAGAA 
GCACTCCGGC 

<?Rc:cCf:AGCC 

GAGTGACAGA 

CGGTTAAGAA 
TGlTCACiATGT 
TAGACAAAArr 
ATCACACACO 
CTTCTAAATG 
TCACAA<iATG 
TXX^^ATCTCITT 
TOTACATAAG 
GAGGCAGGGG 
OXSAAACCCCG 
ATCCCAGCTA 



AAGTCCTOGC 
GCTy:AAAA;feG 
GCCTCTOTCC 
GCCCAiGTreC 
TCAACAACAC 

cGixJ-jHiccrr 

CTGCCTTGGC: 
GGAAGATCAG 
AGQWrrMTT 
CAGAAUT'lXIA 
AGAAOir.AGAA 
TCATTTGGAlCJ 
■J1TCAGCCAC 
'raAACAGCTTP 
GCATrrAAtiT 
TCiATTCG'KW: 
TTGCTACATG 
TACCCCACfifJ 
CTCTCTCCCA 
AACTATr:AGn 
TTTTTTGATTC 
AAAOOAftTCC 
OTAGATCTCA 
Gtj\n*CACGCC 
CX^AGTTCAAG 
AATTAGGCGA 
AlC!?^ATf!Ar.r.T 
CAGCCTCGOC 
AAACAGTCfPR 
TGCCTOTAAT 
'ITiAl&Atf^aWC! 
AGCCGOGCAT 
■ri\iCrlTGAGC 
CTOS5GTnA)C:A 
CAOBAGTTTG 
ttOGAGACTCC 
AGIAAGCCACA 
AGL'RCll'AAA 
CACATOATTA 
CAAATTXSTCC 
TTGCATTA?fcA 
AGTCACTGftG 
GGACACCAAC! 
ACTTAACTTT 
AGAAAGAAGG 
CGGCTGSGATC 
TCTCTACTAA 
CTOGGGAGGC 



CCACCCAGTC 
CACAAGGAOA 
GCCTYTTACAA 
OCtJ^OACGAC^JO 
TACTTCCCTG 
(aX^CGlXACi-AGA 
ATCCTCTTOA 
AlViCAA'l"JT50 
ACyPAACCTCC 

c^ruix^A/LTrc 

«7lAAGCCAAfi 
CTCTTGGGTC 
ATGCCCCACG 
GCCAGGGATG 
GACCi-J^^TlSLA 
TAGCCATCIAA 
AGGtWriTGlXS 

cccTcrcrrcAA 

GGAOCrCGGA 
ATTTk^iAAATO 
AAGGCTCAGA 

GCCATTCTAA 
'fGTAATCOCA 
ACCAGCCTOT 

GCGTAO'J^^iAC 
il?iACCCCAGA 
AACAACa'A^CA 
nTGTAATGGA 
CCCAGAACTi' 
CTf^raCJCTlTCA 
GAT^TGGAT 
TTOCCiAAGfTT 
GaW3TGAGAOC 

ATCITrTTTAA 
GAAATCTITA 
CAGAAGCAiGA 
CTTTCGTAAT 
TAnt^'l^ACT 
ACGGGAATCA 
CAlTCAlCTAiSU^ 
AAAATCATGG 
GTAAGACAGT 
CCA<?ACATfJC 
ACCTCAGGTC 
AAATACAAAA 
TCAOGCAHOA 



CATGCTCCAG 
^'It^AO'J^COr 
<3GGAjGAAGAC 
G'it;AAAAAC'i' 
CCGGAATGAA 
OGGA«flS5ll5G 
ATAGGAAGAT 
•1^<JCA1'CAG« 
CTTAAiGCAGC 
TAGCAAAGC'l' 
GTGrTGGAC^? 
AGAGAAAATC; 

GGCAGCCCAA 

•icrrGGGAAA 

GCACTTCCCA 
AGAU-rnC'JX^J* 
CTCCCTCTCT 
'IV^ACCATACl' 
GGGCAfiTCT?T 
AGGAA'J\^ri3 
fL^TGrG<;?CiTC 
CTCATATGTT 
CICACTTT^iAr. 
CCAACrATGGT 
GGGTOCCCCT 
GGCAGAGGTT 
AAACTCiCC.TC 
TCAAATTAAG 
■J^^ACICCCA 
TAGCAAGACC 
OCCTGTAiGPTC! 
GAGGCTGCAG 
CGTCC'iCAAA 
GaCCCATV3SAT 

a2uu:aaacaa 

A.WlC'J"J^l' 
f.:nCTAATTCA 
AGCTCA^iAri^ 
CPTYTTAAAAA 
CATCTTAAAG 
CACAhiOTCTC 
CTTAGGATCG 
GCCCTGAGAC 
TCTcrrcACAC 

AGGAGTTCAA 

A-meccGGO 

GAATC 



900 

9^0 

108.0 
1140 
1200 

i:?fci) 

132D 
1:3 ^M) 
1440 
2340

:!40o 

24611 

2SB0 
(54) Title: APOPTOSIS MODULATORS THAT INTERACT WITH THE HUNTINGTON'S DISEASE GENE
(57) Abstract 

A family of proteins, including a specific human protein designated as HIP1, has been identified that interact differently with the
gene product of a normal (16 CAG repeat) and an expanded (>44 CAG repeat) HD gene. Expression of the HIP1 protein was found to be
enriched in the brain. Analysis of the sequence of the HIP1 protein indicated that it includes a death effector domain (DED), suggesting an
apoptotic function. Thus, it appears that a normal function of huntingtin may be to bind HIP1 and related apoptosis modulators, reducing their
effectiveness in stimulating cell death. Since expanded huntingtin performs this function less well, there is an increase in HIP1-modulated
cell death in individuals with an expanded repeat in the HD gene. This understanding of the likely role of huntingtin and HIP1 or related
proteins (collectively "HIP-apoptosis modulating proteins") in the pathology of Huntington's disease offers several possibilities for therapy.
First, because the function of huntingtin apparently depends at least in part on the ability to interact with HIP-apoptosis modulating proteins,
added expression (e.g., via gene therapy) of normal (non-expanded) huntingtin or of the HIP-binding region of huntingtin should provide
a therapeutic benefit. Other DED-interacting peptides could also be used to mask and reduce the interaction of HIP-apoptosis modulating
proteins with the death signaling complex. Alternatively, a mutant form of HIP-protein from which the DED has been deleted might be
introduced, for example using gene therapy techniques. Because HIP-apoptosis modulating proteins have been shown to self-associate, a
protein with a deleted DED may compete with endogenous HIP-protein in the formation of these associations, thereby reducing the amount
of apoptotically-active HIP-protein.